Assay for the detection of factors that modulate the expression of INGAP

ABSTRACT

A reporter construct contains mammalian INGAP 5′-regulatory region or a fragment thereof, a minimal promoter element from mammalian INGAP or a heterologous promoter, and a reporter gene. The reporter construct can be used to screen for agents which alone or in combination up-regulate or down-regulate reporter gene expression. Alternatively, the reporter construct can be used to screen for agents that bind to the hamster INGAP 5′-regulatory region or a fragment thereof.

This application is a continuation of U.S. patent application Ser. No. 12/062,740, filed on which is a continuation of U.S. patent application Ser. No. 10/339,767 filed on Jan. 9, 2003, now U.S. Pat. No. 7,355,024, which claims priority to provisional applications: 60/388,315 filed on Jun. 14, 2002, provisional application 60/361,073 filed on Mar. 1, 2002 and provisional application 60/346,898 filed on Jan. 11, 2002, the contents of which are incorporated by reference.

REFERENCE TO SEQ ID

The Sequence listing in “1247090060.tx” created on Aug. 1, 2011, being 108 KB in size, is incorporated by reference.

FIELD OF THE INVENTION

The invention relates to the field of assays for the detection of factors that modulate gene expression. Specifically, the invention relates to reporter constructs and methods for identifying agents that modulate the expression of the INGAP gene.

BACKGROUND OF THE INVENTION

Islet neogenesis gene associated protein (INGAP protein) has been identified as a pancreatic acinar cell protein that can induce islet cell neogenesis from progenitor cells resident in the pancreas in a manner that recapitulates islet development during normal embryogenesis. INGAP is unique in its ability to stimulate growth and differentiation of islets of Langerhans from precursor cells associated with pancreas. These islets evolve a mature insulin secretory profile capable of responding to perturbations in blood glucose in a physiologic manner. This potential anti-diabetic therapeutic has been shown to demonstrate homology across several species and to exert a biological response.

Pancreatic islet cell mass is lost in type 1 diabetes mellitus, a disease in which a progressive autoimmune reaction results in the selective destruction of insulin-producing β-cells. In type 2 diabetes mellitus, so-called adult-onset disease, but also increasingly a condition in young overweight people, the β-cell mass may be reduced by as much as 60% of normal. The number of functioning β-cells in the pancreas is of critical significance for the development, course, and outcome of diabetes. In type 1 diabetes, there is a reduction of β-cell mass to less than 2% of normal. Even in the face of severe insulin resistance, as occurs in type 2 diabetes, diabetes develops only if there is an inadequate compensatory increase in β-cell mass. Thus, the development of either of the major forms of diabetes can be regarded as a failure of adaptive β-cell growth and a subsequent deficiency in insulin secretion. Stimulating the growth of islets and β-cells from precursor cells, known as islet neogenesis, is an attractive approach to the amelioration of diabetes. There is a need in the art for methods to identify agents that can modulate the expression of INGAP, whether in animals or in cultured cells.

BRIEF SUMMARY OF THE INVENTION

It is an object of the invention to provide a reporter construct containing the 5′-regulatory region from mammalian INGAP gene.

It is another object of the invention to provide methods for identifying agents which modulate INGAP expression.

It is another object of the invention to provide a nucleic acid or fragment of INGAP 5′-regulatory region.

It is another object of the invention to provide methods for increasing INGAP expression.

It is another object of the invention to provide a kit for modulating INGAP expression.

These and other objects of the invention are provided by one or more of the embodiments described below.

In one aspect of the invention a reporter construct is provided. The reporter construct comprises a regulatory region nucleotide sequence and a nucleotide sequence encoding a detectable product. In one aspect of the invention, the reporter construct is provided in a vector. The regulatory region nucleotide sequence is linked to the nucleotide sequence encoding a detectable product. The regulatory region nucleotide sequence may comprise one or more fragments of 5′ regulatory region of the INGAP genomic sequence, SEQ ID NO: 23, or it may comprise the entire length of the 5′ regulatory region. In one embodiment of the reporter construct, a promoter element is interposed between the regulatory region nucleotide sequence and the nucleotide sequence encoding a detectable product. The promoter element may be selected from the promoter elements present in the INGAP regulatory sequence. Alternatively, the promoter element present in the vector comprising the reporter construct may be used. The detectable product encoded by the said nucleotide sequence encoding a detectable product could be either a nucleic acid or a protein. The detectable product need not be the INGAP gene nucleic acid or protein.

In another embodiment of the invention, a method for identifying agents that modulate INGAP expression is provided. The method comprises contacting a cell with a test agent, wherein the cell comprises a reporter construct of the present invention. Expression of the detectable nucleic acid or protein product in the cell is determined. A test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the detectable product in the cell.

In another embodiment of the invention, an isolated nucleic acid comprising the genomic sequence of the hamster INGAP gene (SEQ ID NO: 2), or a fragment thereof is provided.

According to another embodiment of the invention, an in vitro method for identifying agents that modulate INGAP expression is provided. The method comprises contacting a test agent with a reporter construct of the present invention in a cell-free system that allows for transcription and translation of a nucleotide sequence. Expression of the detectable product is determined. The test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the detectable product.

According to another embodiment of the invention, an in vitro method for identifying an agent that modulates INGAP expression is provided. The method comprises contacting a test agent with a nucleic acid of the invention. Binding of the test agent to the nucleic acid is determined. The test agent is identified as a modulator of INGAP expression if the test agent binds to the nucleic acid.

According to another embodiment of the invention a method for increasing INGAP expression is provided. An effective amount of a factor that stimulates INGAP expression directly or indirectly, for example cytokines, chemokines, growth factors, or pharmacological agents, is administered to a mammal in need of increased INGAP expression.

According to another embodiment of the invention a kit for modulating INGAP expression is provided. The kit comprises a modulator of INGAP expression and instructions for using the modulator of INGAP expression to modulate INGAP expression.

According to another embodiment of the invention a method for modulating INGAP expression in a mammal to treat a disease state related to reduced islet cell function is provided. The method comprises the step of administering to the mammal an effective amount of a modulator of INGAP expression whereby the level of INGAP expression in the mammal is modified.

All documents cited are, in relevant part, incorporated herein by reference; the citation of any document is not to be construed as an admission that it is prior art with respect to the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the annotation of the hamster INGAP gene structure. The boundaries of introns 1-5 are listed in Table 1.

FIG. 2 shows an overview of the 5′-regulatory region of the hamster INGAP gene (nucleotides 1-3137 of SEQ ID NO: 2) showing many well known and well-characterized transcription factor binding sites. The minimal promoter element contains the regions noted with an underline (CAAT-box, TATA-box, and GC-box).

FIG. 3 shows a schematic of many well known and well-characterized transcription factor-binding sites for nucleotides 1-3123 of the 5′-regulatory region (SEQ ID NO: 1) of the hamster INGAP gene. Table 3 further describes these transcription factor-binding sites.

FIG. 4 shows the predicted transcription start sites within the 5′-regulatory region (SEQ ID NO: 1) of the hamster INGAP gene (SEQ ID NO: 2). The predicted start site is indicated by a boldface nucleotide. The start and end nucleotide numbers are indicated for the promoter sequence. The numbers refer to nucleotide numbers of the hamster INGAP gene (SEQ ID NO: 2).

FIG. 5 shows the adapter primer structure and sequence used in gene walking. Adapter primer 1 (AP1) and adapter primer 2 (AP2) are shown.

FIGS. 6 and 7 show the strategy for reconstructing the hamster INGAP gene. The hamster INGAP gene was reconstructed using the technique of gene walking. Shown are the fragments and the gene specific primers (GSP1 and GSP2) used in PCR amplification for gene walking. Fragments were joined together using unique restriction enzyme sites within each fragment. The nucleotide sequences of the individual primers are listed in Table 2.

FIG. 8 shows the fragments of INGAP 5′-regulatory region, which were cloned into pβGal-basic upstream of a β-galactosidase reporter gene. The labels on the left refer to the nucleotide fragments of SEQ ID NO: 23 which were cloned upstream of pβGal-basic.

FIG. 9A shows reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains various fragments of the 5′-regulatory region (SEQ ID NO: 23) of hamster INGAP DNA cloned upstream of a β-galactosidase reporter gene (pβGal-basic), or in a reporter construct which contains no INGAP DNA. The cells are stimulated with phorbol myristate acetate. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 9B shows reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains nucleotides 2030 to 3137 of the 5′-regulatory region (SEQ ID NO: 23) of hamster INGAP cloned upstream of a β-galactosidase reporter gene, or in a reporter construct which contains no INGAP DNA. The cells are stimulated with leukemia inhibitory factor. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 10 shows the reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains different fragments (see FIG. 8) of the 5′-regulatory region of hamster INGAP cloned upstream of a β-galactosidase reporter gene. The cells are stimulated with phorbol myristate acetate. Concentrations of PMA used are 6 ng/ml, 17 ng/ml, 50 ng/ml, 100 ng/ml, or 300 ng/ml. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 11 shows reporter activity in human embryonic kidney cells (293T) transfected with a reporter construct that contains different fragments (see FIG. 8) of the 5′-regulatory region of hamster INGAP cloned upstream of a β-galactosidase reporter gene. The cells are stimulated with human leukemia inhibitory factor (hLIF). Concentrations of hLIF used are 1 ng/ml, 10 ng/ml, or 30 ng/ml. Promoter activity is assessed by determining the level of β-galactosidase present in the cell using a β-galactosidase luminescent assay.

FIG. 12 shows RNA analysis for INGAP gene upregulation in rat amphicrine pancreatic cells, AR42J, treated with cytokine IL-6 or untreated. Total RNA is probed by Northern analysis for INGAP gene.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise.

The term “promoter” is used to define the region of a gene at which initiation and rate of transcription are controlled. It contains the site at which RNA polymerase binds and also sites for the binding of regulatory proteins, e.g. transcription factors, repressors, etc. In order to differentiate between the transcription initiation site and other sites that modulate the rate of transcription, the promoter region is generally subdivided into a “minimal promoter element” and a “regulatory region”. The “minimal promoter element”, sometimes referred to simply as the “promoter”, may therefore include a TATA box, a GC-rich sequence, and a CAAT box, while the “regulatory region” is usually a long stretch of nucleotide sequence where transcription factors and other factors bind. Most eukaryotic genes have long regulatory regions where many different transcription factors bind. The expression or the lack of expression of a given gene in a given cell type, tissue, organ, or organism is governed by the interactions that take place on its regulatory region.

The term “transcription factor” is used to describe the proteins that bind short stretches of DNA in the regulatory regions of a gene. Transcription factors may interact with each other as well as RNA polymerase. Thus, transcription factors may bind hormones or second messengers, DNA, RNA, other transcription factors, or other proteins. They may activate or inhibit transcription of a given gene. Transcription factors are also sometimes referred to as “enhancers” or “repressors”. Transcription factor binding sites can be used to identify agents that bind to the 5′-regulatory region of the gene and modulate the gene's expression.

The term “reporter” is used to describe a coding sequence attached to a heterologous promoter or enhancer elements and whose product, either nucleic acid or protein, is easily detected and is quantifiable. Some common reporter genes include β-galactosidase (lacZ), chloramphenicol acetyltransferase (cat), β-glucuronidase (GUS), and green fluorescent protein (GFP).

A “reporter construct” is a piece of nucleic acid that includes a promoter element and a reporter gene housed in a suitable vector plasmid DNA. Regulatory region nucleotide sequences may be cloned 5′ of the promoter element to determine if they contain transcription factor binding sites. The reporter construct-containing vector is introduced into a cell that contains many transcription factors. Activation of the reporter gene by transcription factors may be monitored by detection and quantification of the product of the reporter gene.

The term “agent” is used here to describe essentially any means to modulate INGAP expression. An agent may be a chemical compound, a biological agent, a physical force, a mechanical contraption, or any combination thereof.

INGAP Promoter and Regulatory Region

It is a discovery of the present inventors that INGAP gene is regulated by a 5′-regulatory region that is susceptible to modulation by many known transcription factors, including PMA and LIF.

It is a further discovery of the present invention that the 5′-regulatory region nucleotide sequence of the INGAP gene may be used in screening assays to identify agents capable of modulating the INGAP gene expression. These modulating agents have potential as therapeutic agents for treating pathological conditions including, but not limited to, diabetes mellitus, both type 1 and type 2, endocrine and non-endocrine hypoplasia, hypertrophy, adenoma, neoplasia, and nesidioblastosis.

Mammalian INGAP, like most genes, has a 5′-regulatory region followed by introns and exons. The sequence of a mammalian (Hamster sp.) INGAP gene is provided as SEQ ID NO: 2. FIG. 1 details the relative location of the 5′-regulatory region, the introns and the exons of the hamster INGAP gene. The boundaries of introns 1-5 and the location of the TATA-box and the poly-A signal are listed in Table 1.

TABLE 1

Description      Position in INGAP Gene (SEQ ID NO: 2)
TATA-Box         3094
INTRON 1         3150-3426
INTRON 2         3508-4442
INTRON 3         4562-4735
INTRON 4         4874-5459
INTRON 5         5587-5843
Poly-A Signal    6098-6103
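The intron boundaries in Table 1 lend themselves to a simple consistency check. The following is a minimal sketch, assuming the listed positions are 1-based and inclusive in the numbering of SEQ ID NO: 2; the function names are illustrative only and are not part of the disclosure. It computes each intron length and the spans lying between consecutive introns, which by definition correspond to the internal exons.

```python
# Intron boundaries copied from Table 1 (assumed 1-based, inclusive,
# numbered as in SEQ ID NO: 2).
INTRONS = {
    "INTRON 1": (3150, 3426),
    "INTRON 2": (3508, 4442),
    "INTRON 3": (4562, 4735),
    "INTRON 4": (4874, 5459),
    "INTRON 5": (5587, 5843),
}


def intron_lengths(introns):
    """Length in nucleotides of each intron (inclusive coordinates)."""
    return {name: end - start + 1 for name, (start, end) in introns.items()}


def internal_exons(introns):
    """Spans between consecutive introns, i.e. the internal exons."""
    ordered = sorted(introns.values())
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    for name, length in intron_lengths(INTRONS).items():
        print(f"{name}: {length} nt")
    for start, end in internal_exons(INTRONS):
        print(f"internal exon: {start}-{end}")
```

The first and last exons are not derived here because their outer boundaries (transcription start and end) are not given in Table 1.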

The nucleotide sequence of the 5′-regulatory region including the promoter elements of mammalian INGAP, is shown partially in SEQ ID NO: 1, and completely in SEQ ID NO: 2 and 23 (nucleotides 1-3137 of SEQ ID NO: 2). Nucleotides 1-3120 of SEQ ID NO: 1 are identical to nucleotides 1-3120 of SEQ ID NO: 2 and SEQ ID NO: 23. An overview of the 5′-regulatory region is shown in FIG. 2. Representative transcription enhancer/repressor binding sites are shown also in FIG. 2. Predicted transcription enhancer/repressor binding sites for nucleotides 1-3123 of the 5′-regulatory region are shown in FIG. 3. Table 3 at the end of the specification details these transcription factors and their binding sites, and their locations in the regulatory region. Potential transcription factor binding analysis was done using MatInspector Professional™, which is a bioinformatics software that utilizes a library of matrix descriptions for transcription factor binding sites to locate matches in sequences of unlimited length (Quandt, K., Frech, K., Karas, H., Wingender, E., Werner, T. (1995) Nucleic Acids Res. 23, 4878-4884).

Table 3 lists predicted binding proteins (Further Information) based upon their classification into functionally similar matrix families (Family/matrix). The DNA sequence predicted to bind the protein (Sequence), whether sense or antisense DNA (Str), and the location of the sequence in SEQ ID NO: 2 (Position) are listed. Further, the similarity to the consecutive highest conserved nucleotides of a matrix (Core sim.) and the similarity to all nucleotides in that matrix (Matrix sim.), along with the optimized value (Opt), defined in a way that a minimum number of matches is found in non-regulatory test sequences, are also listed. The terms used by MatInspector Professional™ are explained as follows:

OPT: This matrix similarity is the optimized value defined in a way that a minimum number of matches are found in non-regulatory test sequences (i.e. with this matrix similarity the number of false positive matches is minimized). This matrix similarity is used when the user checks “Optimized” as the matrix similarity threshold for MatInspector Professional™.

Family: Each matrix belongs to a so-called matrix family, where functionally similar matrices are grouped together, eliminating redundant matches by MatInspector Professional™ (if the family option was selected). E.g. the matrix family V$NFKB includes 5 similar matrices for NFkappaB (V$NFKAPPAB.01, V$NFKAPPAB.02, V$NFKAPPAB.03, V$NFKAPPAB50.01, V$NFKAPPAB65.01) as well as 1 matrix for the NFkappaB related factor c-Rel (V$CREL.01).

Matrix: The MatInspector Professional™ matrices have an identifier that indicates one of the following seven groups: vertebrates (V$), insects (I$), plants (P$), fungi (F$), nematodes (N$), bacteria (B$), and other functional elements (O$); followed by an acronym for the factor the matrix refers to, and a consecutive number discriminating between different matrices for the same factor. Thus, V$OCT1.02 indicates the second matrix for the vertebrate Oct-1 factor.

Core Sim: The “core sequence” of a matrix is defined as the (usually 4) consecutive highest conserved positions of the matrix. The core similarity is calculated as described here. The maximum core similarity of 1.0 is only reached when the highest conserved bases of a matrix match exactly in the sequence. More important than the core similarity is the matrix similarity which takes into account all bases over the whole matrix length.

Matrix Sim: The matrix similarity is calculated as described here. A perfect match to the matrix gets a score of 1.00 (each sequence position corresponds to the highest conserved nucleotide at that position in the matrix), a “good” match to the matrix usually has a similarity of >0.80. Mismatches in highly conserved positions of the matrix decrease the matrix similarity more than mismatches in less conserved regions.
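The behaviour of the two similarity scores can be illustrated with a small calculation. The following is a minimal sketch, assuming the conservation-index weighting generally attributed to Quandt et al. (1995), cited above; it is not the code of MatInspector Professional™. The six-position toy matrix, the function names, and the way the core is located (the window of four positions with the highest summed conservation index) are assumptions made for illustration.

```python
import math


def conservation_index(column):
    """Conservation index (0-100) of one matrix column.

    `column` maps each base to its relative frequency (summing to 1).
    Assumed form: Ci = (100 / ln 5) * (sum_b P(b) * ln P(b) + ln 5).
    """
    entropy_term = sum(p * math.log(p) for p in column.values() if p > 0)
    return (100.0 / math.log(5)) * (entropy_term + math.log(5))


def matrix_similarity(matrix, site):
    """Ci-weighted score of `site`, normalised by the best possible score."""
    ci = [conservation_index(col) for col in matrix]
    scored = sum(c * col.get(b, 0.0) for c, col, b in zip(ci, matrix, site))
    best = sum(c * max(col.values()) for c, col in zip(ci, matrix))
    return scored / best


def core_similarity(matrix, site, core_len=4):
    """Unweighted similarity over the most conserved window of core_len positions."""
    ci = [conservation_index(col) for col in matrix]
    start = max(range(len(matrix) - core_len + 1),
                key=lambda i: sum(ci[i:i + core_len]))
    core = matrix[start:start + core_len]
    core_site = site[start:start + core_len]
    scored = sum(col.get(b, 0.0) for col, b in zip(core, core_site))
    best = sum(max(col.values()) for col in core)
    return scored / best


# Hypothetical TATA-like toy matrix (not taken from any real matrix library).
TOY_MATRIX = [
    {"T": 1.0},
    {"A": 1.0},
    {"T": 0.9, "A": 0.1},
    {"A": 1.0},
    {"A": 0.6, "T": 0.4},
    {"A": 1.0},
]

if __name__ == "__main__":
    for site in ("TATAAA", "TATATA", "GATAAA"):
        print(site,
              round(core_similarity(TOY_MATRIX, site), 3),
              round(matrix_similarity(TOY_MATRIX, site), 3))
```

In this toy case a mismatch at the fully conserved first position lowers the matrix similarity far more than a mismatch at the weakly conserved fifth position, which is the behaviour described for Matrix Sim above.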

Another aspect of the invention provides for a reporter construct. Reporter constructs contain a 5′ regulatory region nucleotide sequence fragment of SEQ ID NO: 23 (e.g., an enhancer and/or repressor binding site containing region), a promoter element (which may or may not be from INGAP regulatory region nucleotide sequence, SEQ ID NO: 23), and a reporter gene. The 5′-regulatory region nucleotide sequence is positioned upstream of the reporter gene. In order to determine the identity of various transcription factors that bind the 5′ regulatory region nucleotide sequence and to elucidate their binding locations within the 5′ regulatory nucleotide sequence of the INGAP gene, the region may be mapped using deletion analysis. One or more fragments of the regulatory region nucleotide sequence may be initially analyzed for their responses to various transcription factor activators. Once a region of interest is determined, further fine mapping may be carried out, in which DNA from different locations within the regulatory region may be combined to make a more robust and responsive reporter construct. DNA sequences, such as INGAP 5′-regulatory region DNA or a fragment thereof, can be manipulated by methods well known in the art. Examples of such techniques include, but are not limited to, polymerase chain reaction (PCR), restriction endonuclease digestion, ligation, and gene walking. Cloning fragments of DNA, such as 5′-regulatory regions, is well known in the art.
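Planning a deletion analysis amounts to choosing nested fragments by nucleotide coordinates. The following is a minimal sketch, assuming 1-based inclusive coordinates on SEQ ID NO: 23 and a shared 3′ end at nucleotide 3137. The start position 2030 is taken from FIG. 9B; the other start positions, the function names, and the random placeholder sequence are hypothetical, and the real sequence is the one given in the Sequence Listing.

```python
import random

REGION_END = 3137  # last nucleotide of the 5'-regulatory region (SEQ ID NO: 23)


def fragment(seq, start, end):
    """Return nucleotides start..end of seq (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def deletion_series(seq, starts, end=REGION_END):
    """Nested 5' deletions: different start points, one shared 3' end."""
    return {(s, end): fragment(seq, s, end) for s in sorted(starts)}


# Placeholder of the right length only; NOT the INGAP sequence.
random.seed(0)
placeholder = "".join(random.choice("ACGT") for _ in range(REGION_END))

# 2030 appears in FIG. 9B; 1 and 1000 are hypothetical start points.
series = deletion_series(placeholder, [1, 1000, 2030])

if __name__ == "__main__":
    for (start, end), frag in series.items():
        print(f"{start}-{end}: {len(frag)} bp")
```

Each fragment would then be cloned upstream of the reporter gene and its response to a given activator compared with that of the longer and shorter constructs.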

Another approach to quantify the expression levels of a gene is to measure transcription of the gene. PCR-ELISA may be used to capture transcripts onto a solid phase using biotin- or digoxigenin-labelled primers, oligonucleotide probes (oligoprobes) or directly after incorporation of the digoxigenin into the transcripts (Watzinger, F. and Lion, T. (2001) Nucleic Acids Res., 29, e52). Once captured, the transcripts can be detected using an enzyme-labeled avidin or anti-digoxigenin reporter molecule similar to a standard ELISA format. Another approach is to employ real-time PCR to detect the transcript of the reporter gene (Mackay, I. M. and Nitsche, A., Nucleic Acids Res. 2002 Mar. 15; 30 (6), 1292-305). In real-time PCR, fluorogenic probes or dyes are used and accumulation of the amplified product is monitored in real time as the polymerase amplifies the reporter gene sequence.

The promoter element in the reporter construct may or may not be from the same gene as the 5′-regulatory region. As an example, the enhancer/repressor region from the INGAP 5′-regulatory region, or a fragment of the enhancer/repressor region from the INGAP 5′-regulatory region, may be cloned upstream of a heterologous minimal promoter element, e.g., the minimal CMV promoter (Boshart et al., 1985) and the promoters for TK (Nordeen, 1988), IL-2, and MMTV.

Transcription of a gene begins around the minimal promoter. FIG. 4 shows the predicted transcription start sites for mammalian INGAP gene (SEQ ID NO: 2). SEQ ID NO: 2 was analyzed using “Neural Network Promoter Prediction” program designed by Martin Reese to identify eukaryotic promoter recognition elements such as TATA-box, GC-box, CAAT-box, and the transcription start site. These promoter elements are present in various combinations separated by various distances in sequence. The program is available on the Internet and is located at http://www.fruitfly.org/seq_tools/promoter.html.

The reporter construct can be used to identify agents that modulate, either alone or in combination, the expression of INGAP. Some such agents may modulate expression of INGAP by binding to the regulatory region directly while others may regulate expression of transcription factors that bind to the INGAP regulatory region.

The reporter construct can be transfected into a host cell in vitro, or in vivo through the pancreatic duct, either transiently or stably, and a test agent introduced to the assay system. Examples of test agents include, but are not limited to organic and inorganic chemical agents, carbohydrates, proteins, oligonucleotides, cholecystokinin, mechanically induced pressure, and agents which cause a pancreatic duct obstruction. Expression of the reporter gene product can be determined by an assay appropriate for the reporter gene employed. Examples of such assays include, but are not limited to a luminescent assay for β-galactosidase or luciferase, an enzymatic assay for chloramphenicol acetyl transferase, and fluorescence detection for fluorescent proteins. Such assays are well known in the art, and a skilled artisan will be able to select an appropriate assay for the chosen reporter. A test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the reporter gene product. Preferably the level of increase or decrease is at least 50%, 100%, 200%, 500%, or 1000%, but any statistically significant change can be an indicator of modulatory activity. A skilled artisan may also determine reporter gene product expression in untreated cells, and in treated and untreated cells transfected with a promoter-less reporter gene only. Such determinations can be used to determine background levels of expression.
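The thresholds discussed above can be applied to replicate reporter readings as follows. This is a minimal sketch, assuming luminescence readings in arbitrary units, a background value obtained from cells carrying the promoter-less reporter, and a 50% threshold chosen from the range given above. The function names and all numerical values are hypothetical, and no statistical test is performed here; significance would be assessed separately by the practitioner.

```python
from statistics import mean


def percent_change(treated, untreated, background=0.0):
    """Percent change of background-corrected mean signal, treated vs. untreated."""
    t = mean(treated) - background
    u = mean(untreated) - background
    if u <= 0:
        raise ValueError("untreated signal does not exceed background")
    return 100.0 * (t - u) / u


def classify(treated, untreated, background=0.0, threshold_pct=50.0):
    """Label a test agent by comparing the percent change with the threshold."""
    change = percent_change(treated, untreated, background)
    if change >= threshold_pct:
        return "up-regulator"
    if change <= -threshold_pct:
        return "down-regulator"
    return "no call"


if __name__ == "__main__":
    # Hypothetical triplicate readings (arbitrary luminescence units).
    untreated = [110, 100, 105]
    background = 5  # promoter-less reporter control
    print(classify([300, 320, 310], untreated, background))
    print(classify([50, 55, 45], untreated, background))
    print(classify([120, 115, 125], untreated, background))
```

Subtracting the promoter-less background before computing the ratio keeps vector-derived signal from diluting the apparent effect of the test agent.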

Test agents can also be obtained by fractionating pancreatic secretion fluids. A pancreatic duct obstruction can be used as an exemplary method of harvesting pancreatic secretion fluids. The pancreatic secretion fluids can be fractionated by methods well known in the art. Examples include high-pressure liquid chromatography (HPLC), size exclusion chromatography, hydrophobic interacting columns, and density gradient centrifugation. Individual fractions can be tested for agents that modulate reporter gene expression using a method described herein. The individual fractions can be further fractionated to identify agents that modulate reporter gene expression. The identified test agents can be used to modulate the expression of INGAP.

A host cell can be any cell suitable for transfection and maintenance in a suitable assay system. Examples of suitable cells include, but are not limited to, mammalian cells, human cells, mouse cells, rat cells, monkey cells, dog cells, bovine cells, and porcine cells. Preferably the cells used will be human cells. The cells may be either transformed cell lines or primary cells. Whole organ explants may also be used where the regulation may be monitored over many different cell types. Many methods exist in the art for transfecting or infecting cells with reporter construct DNA. Such methods include, but are not limited to, lipofection, electroporation, calcium phosphate precipitation, DEAE dextran, gene guns, and modified viral techniques (e.g., recombinant adenovirus or recombinant retrovirus). The skilled artisan can readily choose a method suitable for use with a given cell type and assay system.

The reporter construct can also be introduced in vivo directly into cells of the pancreas. Examples of methods to introduce the reporter construct into pancreatic cells in vivo include pancreatic duct retrograde perfusion and in vivo electroporation (Mir, 2001). The reporter construct encodes a reporter gene product that is readily measured in vivo. A test agent can be administered systemically or locally, and expression of the reporter gene in vivo can be determined by an assay appropriate for the particular reporter employed. Examples of such assays include a fluorescence assay for green fluorescent protein.

Methods for identifying agents that modulate INGAP expression can also be accomplished in vitro. The reporter construct can be contacted with a test agent in vitro under conditions sufficient for transcription and/or translation of the reporter gene. Components such as rabbit reticulocyte lysates or wheat germ extracts can be utilized for such a method. Subsequently, the expression level of the reporter gene can be determined as described above utilizing an appropriate assay for a given reporter gene. A test agent is identified as a modulator of INGAP expression if the test agent modulates expression of the reporter gene. Threshold levels of change can be set by the practitioner as discussed above.

A test agent can alternatively be contacted with an isolated and purified INGAP 5′-regulatory DNA molecule and one can determine if the test agent binds to the DNA molecule. Test agents can be a chemical agent, a protein, or a nucleic acid. Appropriate INGAP 5′-regulatory DNA molecules would include nucleotides 1-6586 of SEQ ID NO: 2, the 5′-regulatory region DNA (SEQ ID NO: 1, or SEQ ID NO: 23), or any fragment of the 5′-regulatory region, preferably a fragment which contains one or more enhancer/repressor binding sites. Methods to determine binding of the test agent to the fragment of DNA are well known in the art, e.g., electrophoretic mobility shift assay (EMSA). See for example Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2d ed., 1989, at pages 9.50-9.51. Fragments of the 5′-regulatory region can be obtained by methods well known in the art using the disclosed sequence (SEQ ID NO: 2). Examples of such methods include PCR, restriction enzyme digestion, and chemical synthesis. Any fragment of DNA within the 5′-regulatory region (SEQ ID NO: 1, or 23) can be used. The exact location at which an agent binds can be determined, for example, by utilizing smaller fragments to map precisely the binding site for the test agent. Test agents that bind in the assay can be further tested in other assays that require modulatory activity.

An agent that causes an increase or decrease in reporter gene expression can be used as a modulator of INGAP expression. The modulator can be administered to a mammal in need of such modulation. Examples of mammals that may need INGAP expression modulation are those with reduced pancreatic function, in particular reduced islet cell function. Such mammals include those who have diabetes mellitus, impaired glucose tolerance, impaired fasting glucose, hyperglycemia, obesity, and pancreatic insufficiency.

An agent that is identified as a modulator of INGAP expression can be supplied in a kit to treat diseases associated with reduced islet cell function. The kit would comprise, in single or divided containers and in single or divided doses, a modulator of INGAP expression. Written instructions may be included for using the modulator of INGAP expression. The instructions may simply refer a reader to another location such as a website or other information source.

Agents that cause an increase in reporter gene expression can be used to increase INGAP expression to treat a disease state related to reduced islet cell function. Agents that cause a decrease in reporter gene expression can be used to decrease INGAP expression to treat a disease state related to hyperactivity of islet cells or a disease where reduced INGAP expression is desirable. Examples of such agents include, but are not limited to, PMA, LIF, interleukin-6, Oncostatin M, and ciliary neurotrophic factor. Agents can be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, parenteral, topical, sublingual, rectal, or pancreatic duct retrograde perfusion. Agents for oral administration can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the mammal.

Agents for intravenous, intramuscular, intra-arterial, transdermal, and subcutaneous injections can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for injection into the mammal. Agents for intranasal, topical, and rectal administration can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for surface administration to the mammal. Mammals in need of an increase in INGAP expression include, for example, mammals with diabetes mellitus, impaired glucose tolerance, impaired fasting glucose, hyperglycemia, obesity, and pancreatic insufficiency. Mammals in need of a decrease in INGAP expression include, for example, mammals with hypoglycemia.

The following examples are offered by way of illustration and do not limit the invention disclosed herein.

EXAMPLES

Example 1 Hamster INGAP Genomic Sequence and Structure

The hamster INGAP genomic sequence and structure were determined by gene walking (Clontech) and DNA sequencing. Gene walking is a method for walking upstream toward a promoter or downstream in genomic DNA from a known sequence, such as cDNA. This method utilizes four uncloned, adapter-ligated genomic fragment libraries. The manufacturer's recommended protocol was followed with one notable exception: hamster genomic DNA was used to create the uncloned, adapter-ligated genomic fragment libraries.

To create the uncloned, adapter-ligated genomic fragment libraries, genomic DNA was purified from hamster cells. Four separate aliquots were thoroughly digested with PvuII, StuI, DraI, or EcoRV. Following digestion, inactivation of the restriction enzymes, and dephosphorylation, each separate pool of DNA fragments was ligated to an adapter, AP1 (SEQ ID NO: 489) or AP2 (SEQ ID NO: 490); see FIG. 5. The adapter was phosphorylated to provide the requisite phosphate group for a ligation reaction.

Also note that the 3-prime side of the short adapter contains an amine group to prevent the adapters from forming concatamers.

Two gene specific primers (GSP1 and GSP2) were designed for each region of known sequence (i.e., the exons of the INGAP gene). See FIG. 6 for fragment location and GSP1 and GSP2 location. The gene specific primers were designed as reverse PCR primers for all fragments except fragments 1_2 and 14_5. The gene specific primers for fragments 1_2 and 14_5 were designed as forward primers. Adapter primer 1 (AP1) and adapter primer 2 (AP2) (FIG. 5) served as forward PCR primers for all fragments except fragments 1_2 and 14_5, for which they served as reverse PCR primers. The outer gene specific primer (GSP1) was used with adapter primer 1 in a first PCR reaction. To increase specificity, a second, nested PCR was set up using the inner gene specific primer (GSP2) and adapter primer 2. A small aliquot of the first reaction served as template for the second reaction. Gene specific PCR primers utilized for gene walking are listed in Table 2, and the strategy used to build the INGAP genomic sequence is shown in FIGS. 6 and 7. The arrowheads in FIG. 6 represent the adapter primers (AP1 and AP2), while the circles represent the gene specific primers (GSP1 and GSP2).

TABLE 2
NAME (LOCATION) | SEQUENCE
INGEN 21_3 (1464, 1482) | 5′-ACAAGCAATCTAGAGATGG-3′ (SEQ ID NO: 3)
INGEN 19_3 (1401, 1423) | 5′-GTTCAGCTATGTTCATAGCAGGG-3′ (SEQ ID NO: 4)
INGEN 16_3 (1855, 1876) | 5′-GTCTGTATGACTGTGTGGGAAG-3′ (SEQ ID NO: 5)
INGEN 15_3 (1929, 1948) | 5′-GCACTTGAACTCAATGGCTC-3′ (SEQ ID NO: 6)
INGEN 14_3 (2147, 2168) | 5′-GAACCACCTGACATGGGTGATG-3′ (SEQ ID NO: 7)
INGEN 13_3 (2177, 2200) | 5′-GGGCATCGTATCATCTGGTTACAG-3′ (SEQ ID NO: 8)
INGEN 8_3 (2544, 2565) | 5′-GGTTCAAAAAAGCTGCTTCAAC-3′ (SEQ ID NO: 9)
INGEN 7_3 (2666, 2689) | 5′-GGAATAGCTGCAATTTATGCCCAT-3′ (SEQ ID NO: 10)
INGEN 4_3 (2833, 2858) | 5′-CTTAGGAACATTCAGGCAGCCTCCTG-3′ (SEQ ID NO: 11)
INGEN 3_3 (2866, 2891) | 5′-GTTGCCCTCTGCCACGTGTCAAGTTC-3′ (SEQ ID NO: 12)
INGEN 2_3 (3444, 3470) | 5′-CATCCAAGACATCCTACAGAGGGTCAT-3′ (SEQ ID NO: 13)
INGEN 1_3 (3475, 3501) | 5′-CCCAAGAAAGGAACATCAGGCAGGAAA-3′ (SEQ ID NO: 14)
INGEN 2_2 (3330, 3350) | 5′-CCAAATGAGTGCTTCCCTGAA-3′ (SEQ ID NO: 15)
INGEN 1_2 (3241, 3266) | 5′-GCAGCACTCTGAAACTCAGTAGAGTT-3′ (SEQ ID NO: 16)
INGEN 14_5 (5544, 5563) | 5′-GCTGCTGACCGTGGTTATTG-3′ (SEQ ID NO: 17)
INGEN 13_5 (5463, 5485) | 5′-ACACTACCCAACGGAAGTGGATG-3′ (SEQ ID NO: 18)
INGAP1_1L (3475, 3492) | 5′-TTTCCTGCCTGATGTTCC-3′ (SEQ ID NO: 19)
INGAP1_1R (5957, 5976) | 5′-TCATACTTGCTTCCTTGTCC-3′ (SEQ ID NO: 20)
INGAP2_1L (4470, 4488) | 5′-CTTCACGTATAACCTGTCC-3′ (SEQ ID NO: 21)
INGAP2_1R (5905, 5923) | 5′-ATTAGAACTGCCCTAGACC-3′ (SEQ ID NO: 22)
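The primer coordinates and orientations listed in Table 2 can be cross-checked computationally. The following Python sketch is offered for illustration only; the function and variable names are assumptions and form no part of the disclosed method. It checks that each location span equals the primer length, and that INGAP1_1L (3475, 3492) matches the first 18 nucleotides of the reverse complement of INGEN 1_3 (3475, 3501), as expected when a reverse primer and a forward primer share the same starting coordinate.

```python
# Illustrative check of Table 2 primers (helper names are hypothetical).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate span."""
    return end - start + 1


# Three entries copied from Table 2: name -> ((start, end), sequence)
PRIMERS = {
    "INGEN 21_3": ((1464, 1482), "ACAAGCAATCTAGAGATGG"),
    "INGEN 1_3": ((3475, 3501), "CCCAAGAAAGGAACATCAGGCAGGAAA"),
    "INGAP1_1L": ((3475, 3492), "TTTCCTGCCTGATGTTCC"),
}

# Each location span should equal the primer length.
lengths_match = {
    name: span_length(*loc) == len(seq) for name, (loc, seq) in PRIMERS.items()
}

# INGEN 1_3 is a reverse primer, so its reverse complement gives the strand
# on which the forward primer INGAP1_1L is written.
top_strand = reverse_complement(PRIMERS["INGEN 1_3"][1])
overlap_ok = top_strand.startswith(PRIMERS["INGAP1_1L"][1])

print(lengths_match)
print(top_strand, overlap_ok)
```

The same check can be extended to the remaining rows of Table 2 by adding them to the dictionary.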

The PCR fragments were sequenced to determine the nucleotide sequence of the INGAP 5′-regulatory region, the introns, the intron/exon junctions, and the 3-prime polyadenylation regions. The nucleotide sequence of hamster INGAP genomic DNA is shown in SEQ ID NO: 2.

Example 2 Cloning Hamster INGAP 5′-Regulatory Region Fragment into a Reporter Construct

To construct the INGAP 5′-regulatory region, individual PCR fragments were joined together at unique restriction sites located within two adjoining fragments. FIGS. 6 and 7 detail the strategy used to piece the INGAP 5′-regulatory region together. Fragments 8_3 and 2_3 were joined at a unique SphI site; 14_3 and 8_3 were joined at a unique BbsI site; 16_3 and 14_3 were joined at a unique PstI site. The nucleotide sequence of hamster INGAP 5′-regulatory region DNA is shown in SEQ ID NO: 1 and 23 in the sequence listing.
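Whether a restriction site is unique within a pair of adjoining fragments can be confirmed in silico before joining. The Python sketch below is illustrative only: the recognition sequences for SphI (GCATGC), BbsI (GAAGAC), and PstI (CTGCAG) are the standard ones, but the test sequence is a made-up toy string rather than INGAP sequence, and the helper names are assumptions.

```python
# Illustrative uniqueness check for restriction sites (toy data, hypothetical names).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard recognition sequences for the enzymes named in Example 2.
SITES = {"SphI": "GCATGC", "BbsI": "GAAGAC", "PstI": "CTGCAG"}


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_overlapping(seq: str, pattern: str) -> int:
    """Count occurrences of pattern in seq, allowing overlaps."""
    count, start = 0, 0
    while True:
        idx = seq.find(pattern, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1


def count_site(seq: str, site: str) -> int:
    """Count a recognition site on both strands.

    A palindromic site equals its own reverse complement, so the set
    collapses to one pattern and the site is not counted twice.
    """
    seq = seq.upper()
    patterns = {site, reverse_complement(site)}
    return sum(count_overlapping(seq, p) for p in patterns)


def unique_sites(seq: str) -> dict:
    """Map each enzyme name to True if its site occurs exactly once."""
    return {name: count_site(seq, site) == 1 for name, site in SITES.items()}


# Toy sequence (NOT INGAP sequence): one SphI site, one BbsI site, two PstI sites.
toy = "TTGCATGCAAGAAGACTTCTGCAGAACTGCAGTT"
print(unique_sites(toy))
```

In this toy case PstI would be rejected as a joining site because it occurs twice, while SphI and BbsI would each qualify.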

The hamster INGAP 5′-regulatory region or a fragment of the 5′-regulatory region was cloned into a reporter plasmid, pβGal-Basic (Clontech). The 5′-regulatory region or fragments were cloned utilizing the unique XmaI site from the gene walking adapter primer and a unique BglII site located at the 3-prime side of the regulatory region. FIG. 8 details the fragments cloned into pβGal-Basic. The sizes of the fragments are indicated to the right of the fragments and are expressed as the number of nucleotides of the fragment.

Example 3 Assay System to Screen for Factors that Modulate the Expression of INGAP

Promoter analysis of INGAP identified a number of potential promoter-proximal regulatory sites, including consensus transcription factor binding sites for cAMP response element (CRE), AP-1, and STAT. Promoter-fragment reporter-gene constructs were transiently transfected into 293T cells, and co-transfection of secretory alkaline phosphatase was used to normalize for transfection efficiency.

Reporter constructs containing INGAP 5′-regulatory region fragments 2_3 sP (SEQ ID NO: 37), 2_3 dP (SEQ ID NO: 38), 2_3 pP (SEQ ID NO: 36), 14_3P (SEQ ID NO: 34), 16_3P (SEQ ID NO: 31), or 19_3P (SEQ ID NO: 23) were transfected into human cells. The pβGal-Basic plasmid without the hamster INGAP DNA was also transfected into human cells as a control to measure the level of endogenous reporter activity. Two days following transfection, the cells were treated with PMA for 24 hours or were left untreated. To determine the level of promoter activity, the amount of β-galactosidase gene product was determined using a luminescent assay for β-galactosidase. FIG. 9A shows that construct 14_3P produced the highest reporter gene expression, followed by 2_3 pP and 16_3P.
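The normalization described above can be sketched as a short calculation. The Python example below is a minimal illustration, assuming that each well yields one β-galactosidase reading and one secretory alkaline phosphatase reading in arbitrary luminescence units; the function names and all numbers are hypothetical and are not data from the figures.

```python
# Illustrative normalization of reporter readings (hypothetical names and values).
from statistics import mean


def normalized_activity(beta_gal: float, seap: float) -> float:
    """Beta-galactosidase signal divided by the co-transfected alkaline
    phosphatase signal, correcting for transfection efficiency."""
    if seap <= 0:
        raise ValueError("alkaline phosphatase reading must be positive")
    return beta_gal / seap


def fold_induction(treated, untreated) -> float:
    """Ratio of mean normalized activity in treated versus untreated wells.

    Each argument is a list of (beta_gal, seap) replicate pairs.
    """
    treated_mean = mean([normalized_activity(b, s) for b, s in treated])
    untreated_mean = mean([normalized_activity(b, s) for b, s in untreated])
    return treated_mean / untreated_mean


# Made-up replicate readings for one construct, with and without an agent.
untreated_wells = [(1000.0, 50.0), (1200.0, 60.0)]
treated_wells = [(3000.0, 50.0), (3600.0, 60.0)]

induction = fold_induction(treated_wells, untreated_wells)
print(f"fold induction: {induction:.1f}")
```

The same calculation applied to the promoterless pβGal-Basic control gives the background against which each construct can be compared.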

A reporter construct containing INGAP 5′-regulatory region DNA nucleotides 2030 to 3120 was transfected into human cells. The pβGal-Basic plasmid without the hamster INGAP DNA was also transfected into human cells as a control to measure the level of endogenous reporter activity. Two days following transfection, the cells were treated with LIF for 24 hours or were left untreated. To determine the level of promoter activity, the amount of β-galactosidase gene product was determined using a luminescent assay for β-galactosidase. FIG. 9B shows the results. LIF was determined to increase the activity of the 5′-regulatory region of mammalian INGAP. Forskolin (an activator of cAMP/CREB/CRE) did not modulate gene expression (data not shown).

It is important to note that when present in human cells, the hamster INGAP 5′-regulatory region is transactivated by the human transcription factors. Thus, linked to a reporter gene, the 5′-regulatory region of hamster INGAP creates a sensitive assay system to screen for factors that modulate the expression of INGAP.

Example 4 Determination of Approximate Location of PMA and LIF-Mediated Transcription Factor Binding in the 5′-Regulatory Region

To map the approximate location of PMA-initiated or LIF-initiated transcription factor binding, different fragments of the hamster INGAP 5′-regulatory region were cloned into pβGal-Basic. See FIG. 8. The fragments cloned into the reporter construct were 2_3 sP (SEQ ID NO: 37), 2_3 dP (SEQ ID NO: 38), 2_3 pP (SEQ ID NO: 36), 14_3P (SEQ ID NO: 34), 16_3P (SEQ ID NO: 31), and 19_3P (SEQ ID NO: 23). The reporter constructs were transfected into human cells. Two days following transfection, the cells were treated with different concentrations of PMA or LIF for 24 hours. The concentrations of PMA used were 6 ng/ml, 17 ng/ml, 50 ng/ml, 100 ng/ml, or 300 ng/ml. The concentrations of LIF used were 1 ng/ml, 10 ng/ml, or 30 ng/ml. To determine the level of promoter activity, the amount of β-galactosidase gene product was determined using a luminescent assay for β-galactosidase. FIGS. 10 and 11 show the results for PMA and LIF treatment, respectively. Both PMA and LIF activated the reporter constructs in the transfected cells. The exact location of the DNA contact sites can be narrowed further by cloning smaller fragments of the hamster INGAP 5′-regulatory region and by site directed mutations or deletions.
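The logic of narrowing a responsive region with a nested fragment series can be sketched as follows. This Python example is a simplified illustration under stated assumptions: all constructs share the same 3′ end, a single responsive element is present, and a construct is called responsive when its fold induction meets an arbitrary threshold. The construct names, coordinates, threshold, and values are hypothetical and are not taken from FIGS. 8, 10, or 11.

```python
# Illustrative deletion-mapping logic (hypothetical constructs and values).
from typing import List, NamedTuple, Optional, Tuple


class Construct(NamedTuple):
    name: str
    five_prime_start: int   # hypothetical coordinate of the fragment's 5' end
    fold_induction: float   # treated / untreated normalized reporter activity


def responsive_interval(
    constructs: List[Construct], threshold: float = 2.0
) -> Optional[Tuple[int, int]]:
    """Return the interval in which the response is lost.

    Constructs are ordered from longest (smallest 5' start) to shortest.
    The first adjacent pair in which the longer construct responds and the
    shorter one does not brackets the candidate element.
    """
    ordered = sorted(constructs, key=lambda c: c.five_prime_start)
    for longer, shorter in zip(ordered, ordered[1:]):
        if longer.fold_induction >= threshold > shorter.fold_induction:
            return (longer.five_prime_start, shorter.five_prime_start)
    return None


series = [
    Construct("construct A", 100, 3.1),
    Construct("construct B", 600, 2.8),
    Construct("construct C", 1200, 1.1),
    Construct("construct D", 1500, 1.0),
]

print(responsive_interval(series))
```

Here the response is retained by construct B and lost in construct C, so the candidate element would lie between their 5′ ends; smaller fragments or site directed mutations would then refine that interval.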

Example 5 RNA Analysis of INGAP Gene Upregulation

To determine if INGAP RNA levels increase after stimulation with a cytokine that signals through STAT, rat amphicrine pancreatic cells, AR42J, were treated with IL-6 (1000 U/ml) for 24 hours. Total RNA was extracted from the treated and untreated cells using techniques well known in the art, e.g., using TRIZOL® reagent.

Equal amounts of total RNA (10 μg) were loaded on a 2.5% formaldehyde gel and electrophoresed for 4 hours at 70 V with constant circulation of the buffer using a circulating pump. The gel was photographed, washed twice with water at room temperature, and soaked in 20×SSC. The RNA was transferred from the gel to a nylon membrane (Amersham) in 20×SSC overnight following a standard procedure. The membrane was washed with 20×SSC to remove any agar that might have attached to the membrane and baked for 4 hours at 80° C.

One hundred nanograms of hamster INGAP cDNA was labeled using a Random Prime Labeling kit (Roche-BMB) and α-³²P dCTP (ICN). Approximately 20 million counts were used for hybridization in 20 ml hybridization buffer overnight at 42° C. following the standard procedure. The blot was washed as follows: twice at room temperature with 2×SSC for 10 minutes each; twice at 42° C. with 2×SSC for 10 minutes each; and twice at 55° C. with 1×SSC for 10 minutes each. The membrane was exposed to film (XOMAT-Kodak) and kept at −80° C. overnight before developing.

Treatment with IL-6 caused an increase in INGAP gene expression (FIG. 12). These data demonstrate that extracellular factors that elevate AP-1-binding transcription factors and STAT-binding transcription factors are involved in the regulation of INGAP gene expression. These studies suggest that it is feasible to enhance INGAP expression as a means of inducing islet neogenesis.

While particular embodiments of the present invention have been illustrated and described, it would be obvious to those skilled in the art that various other changes and modifications can be made without departing from the spirit and scope of the invention. It is therefore intended to cover in the appended claims all such changes and modifications that are within the scope of this invention.

TABLE 3
SEQ ID NO: | Family/matrix | Further Information | Opt. | Position from-to | anchor
SEQ ID NO: 41 | V$LEFF/LEF1.01 | TCF/LEF-1, involved in the Wnt signal transduction pathway | 0.86 | 12-28 | 20
SEQ ID NO: 42 | V$MITF/MIT.01 | MIT (microphthalmia transcription factor) and TFE3 | 0.81 | 22-40 | 31
SEQ ID NO: 43 | V$OCT1/OCT1.05 | octamer-binding factor 1 | 0.90 | 27-41 | 34
SEQ ID NO: 44 | V$TCFF/TCF11.01 | TCF11/KCR-F1/Nrf1 homodimers | 1.00 | 32-38 | 35
SEQ ID NO: 45 | V$MYOF/MYOGNF1.01 | Myogenin/nuclear factor 1 or related factors | 0.71 | 25-53 | 39
SEQ ID NO: 46 | V$ZBPF/ZBP89.01 | Zinc finger transcription factor ZBP-89 | 0.93 | 36-48 | 42
SEQ ID NO: 47 | V$SP1F/GC.01 | GC box elements | 0.88 | 38-52 | 45
SEQ ID NO: 48 | V$PERO/PPARA.01 | PPAR/RXR heterodimers | 0.70 | 44-64 | 54
SEQ ID NO: 49 | V$PAX5/PAX9.01 | zebrafish PAX9 binding sites | 0.78 | 43-71 | 57
SEQ ID NO: 50 | V$TBPF/ATATA.01 | Avian C-type LTR TATA box | 0.81 | 68-84 | 76
SEQ ID NO: 51 | V$HMTB/MTBF.01 | muscle-specific Mt binding site | 0.90 | 76-84 | 80
SEQ ID NO: 52 | V$OCT1/OCT1.06 | octamer-binding factor 1 | 0.80 | 74-88 | 81
SEQ ID NO: 65 | … | … (en-1) | … | … | …
SEQ ID NO: 66 | V$BARB/BARBIE.01 | barbiturate-inducible element | 0.88 | 230-244 | 237
SEQ ID NO: 67 | V$TBPF/TATA.01 | cellular and viral TATA box elements | 0.90 | 230-246 | 238
SEQ ID NO: 68 | V$BARB/BARBIE.01 | barbiturate-inducible element | 0.88 | 252-266 | 259
SEQ ID NO: 69 | V$MYT1/MYT1.01 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.75 | 272-284 | 278
SEQ ID NO: 70 | V$SMAD/SMAD4.01 | Smad4 transcription factor involved in TGF-beta signaling | 0.94 | 304-312 | 308
SEQ ID NO: 71 | V$HOXF/CRX.01 | Cone-rod homeobox-containing transcription factor/otx-like homeobox gene | 0.94 | 312-328 | 320
SEQ ID NO: 72 | V$ECAT/NFY.01 | nuclear factor Y (Y-box binding factor) | 0.90 | 337-351 | 344
SEQ ID NO: 73 | V$HOXF/PTX1.01 | Pituitary Homeobox 1 (Ptx1) | 0.79 | 337-353 | 345
SEQ ID NO: 74 | V$FKHD/FREAC2.01 | Fork head RElated ACtivator-2 | 0.84 | 362-378 | 370
SEQ ID NO: 75 | V$MINI/MUSCLE_INI.02 | Muscle Initiator Sequence | 0.86 | 401-419 | 410
SEQ ID NO: 76 | V$MOKF/MOK2.01 | Ribonucleoprotein associated zinc finger protein MOK-2 (mouse) | 0.74 | 409-429 | 419
SEQ ID NO: 77 | V$ZFIA/ZID.01 | zinc finger with interaction domain | 0.85 | 414-426 | 420
SEQ ID NO: 78 | V$CART/XVENT2.01 | Xenopus homeodomain factor Xvent-2; early BMP signaling response | 0.82 | 418-434 | 426
SEQ ID NO: 79 | V$OCT1/OCT1.04 | octamer-binding factor 1 | 0.80 | 421-435 | 428
SEQ ID NO: 80 | V$HOMS/S8.01 | Binding site for S8 type homeodomains | 0.97 | 426-434 | 430
SEQ ID NO: 81 | V$NKXH/NKX25.02 | homeo domain factor Nkx-2.5/Csx, tinman homolog low affinity sites | 0.88 | 424-436 | 430
SEQ ID NO: 82 | V$CREB/CREBP1.01 | cAMP-responsive element binding protein 1 | 0.80 | 425-445 | 435
SEQ ID NO: 83 | V$COMP/COMP1.01 | COMP1, cooperates with myogenic proteins in multicomponent complex | 0.76 | 434-454 | 444
SEQ ID NO: 84 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 444-460 | 452
SEQ ID NO: 85 | V$ETSF/GABP.01 | GABP: GA binding protein | 0.85 | 454-470 | 462
SEQ ID NO: 86 | V$LEFF/LEF1.01 | TCF/LEF-1, involved in the Wnt signal transduction pathway | 0.86 | 463-479 | 471
SEQ ID NO: 87 | V$STAT/STAT6.01 | STAT6: signal transducer and activator of transcription 6 | 0.84 | 464-482 | 473
SEQ ID NO: 88 | V$GATA/GATA1.03 | GATA-binding factor 1 | 0.95 | 490-502 | 496
SEQ ID NO: 89 | V$SRFF/SRF.01 | serum response factor | 0.66 | 487-505 | 496
SEQ ID NO: 90 | V$EVI1/EVI1.04 | Ecotropic viral integration site 1 encoded factor | 0.77 | 493-509 | 501
SEQ ID NO: 91 | V$AP4R/TH1E47.01 | Thing1/E47 heterodimer, TH1 bHLH member specific expression in a variety of embryonic tissues | 0.93 | 509-525 | 517
SEQ ID NO: 92 | V$AP4R/TAL1BETAITF2.01 | Tal-1beta/ITF-2 heterodimer | 0.85 | 512-528 | 520
SEQ ID NO: 93 | V$NEUR/NEUROD1.01 | DNA binding site for NEUROD1 (BETA-2/E47 dimer) | 0.83 | 514-526 | 520
SEQ ID NO: 94 | V$MEF2/MEF2.05 | MEF2 | 0.96 | 518-540 | 529
SEQ ID NO: 95 | V$EVI1/EVI1.04 | Ecotropic viral integration site 1 encoded factor | 0.77 | 523-539 | 531
SEQ ID NO: 96 | V$MEF2/AMEF2.01 | myocyte enhancer factor | 0.80 | 521-543 | 532
SEQ ID NO: 97 | V$TBPF/MTATA.01 | Muscle TATA box | 0.84 | 524-540 | 532
SEQ ID NO: 98 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 543-559 | 551
SEQ ID NO: 99 | V$PDX1/ISL1.01 | Pancreatic and intestinal lim-homeodomain factor | 0.82 | 543-563 | 553
SEQ ID NO: 100 | V$OCT1/OCT1.05 | octamer-binding factor 1 | 0.90 | 556-570 | 563
SEQ ID NO: 101 | V$CIZF/NMP4.01 | NMP4 (nuclear matrix protein 4)/CIZ (Cas-interacting zinc finger protein) | 0.97 | 562-572 | 567
SEQ ID NO: 102 | V$EVI1/EVI1.01 | Ecotropic viral integration site 1 encoded factor | 0.72 | 569-585 | 577
SEQ ID NO: 103 | V$AP1F/AP1.01 | AP1 binding site | 0.95 | 582-602 | 592
SEQ ID NO: 104 | V$PIT1/PIT1.01 | Pit1, GHF-1 pituitary specific pou domain transcription factor | 0.86 | 589-599 | 594
SEQ ID NO: 105 | V$AP1F/AP1.01 | AP1 binding site | 0.95 | 586-606 | 596
SEQ ID NO: 106 | V$VMYB/VMYB.01 | v-Myb | 0.90 | 593-603 | 598
SEQ ID NO: 107 | V$CIZF/NMP4.01 | NMP4 (nuclear matrix protein 4)/CIZ (Cas-interacting zinc finger protein) | 0.97 | 595-605 | 600
SEQ ID NO: 108 | V$GREF/PRE.01 | Progesterone receptor binding site | 0.84 | 604-622 | 613
SEQ ID NO: 109 | V$GKLF/GKLF.01 | Gut-enriched Krueppel-like factor | 0.91 | 632-646 | 639
SEQ ID NO: 110 | V$CIZF/NMP4.01 | NMP4 (nuclear matrix protein 4)/CIZ (Cas-interacting zinc finger protein) | 0.97 | 637-647 | 642
SEQ ID NO: 111 | V$NFAT/NFAT.01 | Nuclear factor of activated T-cells | 0.97 | 640-650 | 645
SEQ ID NO: 112 | V$MAZF/MAZ.01 | Myc associated zinc finger protein (MAZ) | 0.90 | 649-661 | 655
SEQ ID NO: 113 | V$EGRF/WT1.01 | Wilms Tumor Suppressor | 0.88 | 658-672 | 665
SEQ ID NO: 114 | V$ZBPF/ZBP89.01 | Zinc finger transcription factor ZBP-89 | 0.93 | 663-675 | 669
SEQ ID NO: 115 | V$IRFF/IRF2.01 | interferon regulatory factor 2 | 0.80 | 702-716 | 709
SEQ ID NO: 116 | V$BRNF/BRN2.01 | POU factor Brn-2 (N-Oct 3) | 0.91 | 746-762 | 754
SEQ ID NO: 117 | V$ETSF/PU1.01 | Pu.1 (Pu120) Ets-like transcription factor identified in lymphoid B-cells | 0.86 | 746-762 | 754
SEQ ID NO: 118 | V$EVI1/EVI1.04 | Ecotropic viral integration site 1 encoded factor | 0.77 | 750-766 | 758
SEQ ID NO: 119 | V$EVI1/EVI1.05 | Ecotropic viral integration site 1 encoded factor | 0.80 | 755-771 | 763
SEQ ID NO: 120 | V$ZBPF/ZBP89.01 | Zinc finger transcription factor ZBP-89 | 0.93 | 764-776 | 770
SEQ ID NO: 121 | V$FAST/FAST1.01 | FAST-1 SMAD interacting protein | 0.81 | 769-783 | 776
SEQ ID NO: 122 | V$TBPF/TATA.02 | Mammalian C-type LTR TATA box | 0.89 | 771-787 | 779
SEQ ID NO: 123 | V$PAX5/PAX9.01 | zebrafish PAX9 binding sites | 0.78 | 781-809 | 795
SEQ ID NO: 124 | V$OCT1/OCT.01 | Octamer binding site (OCT1/OCT2 consensus) | 0.79 | 793-807 | 800
SEQ ID NO: 125 | V$OCTP/OCT1P.01 | octamer-binding factor 1, POU-specific domain | 0.86 | 798-810 | 804
SEQ ID NO: 126 | V$SRFF/SRF.01 | serum response factor | 0.66 | 797-815 | 806
SEQ ID NO: 127 | V$EVI1/EVI1.05 | Ecotropic viral integration site 1 encoded factor | 0.80 | 802-818 | 810
SEQ ID NO: 128 | V$CLOX/CDP.01 | cut-like homeodomain protein | 0.75 | 803-819 | 811
SEQ ID NO: 129 | V$EVI1/EVI1.02 | Ecotropic viral integration site 1 encoded factor | 0.83 | 807-823 | 815
SEQ ID NO: 130 | V$ECAT/NFY.02 | nuclear factor Y (Y-box binding factor) | 0.91 | 810-824 | 817
SEQ ID NO: 131 | V$HAML/AML3.01 | Runt-related transcription factor 2/CBFA1 (core-binding factor, runt domain, alpha subunit 1) | 0.84 | 811-825 | 818
SEQ ID NO: 132 | V$PCAT/CAAT.01 | cellular and viral CCAAT box | 0.90 | 813-823 | 818
SEQ ID NO: 133 | V$GATA/GATA.01 | GATA binding site (consensus) | 0.95 | 818-830 | 824
SEQ ID NO: 134 | V$HNF1/HNF1.02 | Hepatic nuclear factor 1 | 0.76 | 818-834 | 826
SEQ ID NO: 135 | V$HOXT/MEIS1_HOXA9.01 | Homeobox protein MEIS1 binding site | 0.79 | 823-835 | 829
SEQ ID NO: 136 | V$ECAT/NFY.01 | nuclear factor Y (Y-box binding factor) | 0.90 | 837-851 | 844
SEQ ID NO: 137 | V$FKHD/FREAC2.01 | Fork head RElated ACtivator-2 | 0.84 | 844-860 | 852
SEQ ID NO: 138 | V$EVI1/EVI1.06 | Ecotropic viral integration site 1 encoded factor | 0.83 | 846-862 | 854
SEQ ID NO: 139 | V$GATA/GATA1.01 | GATA-binding factor 1 | 0.96 | 853-865 | 859
SEQ ID NO: 140 | V$PCAT/ACAAT.01 | Avian C-type LTR CCAAT box | 0.86 | 856-866 | 861
SEQ ID NO: 141 | V$XBBF/RFX1.01 | X-box binding protein RFX1 | 0.89 | 909-927 | 918
SEQ ID NO: 142 | V$EBOX/MYCMAX.02 | c-Myc/Max heterodimer | 0.92 | 912-928 | 920
SEQ ID NO: 143 | V$MITF/MIT.01 | MIT (microphthalmia transcription factor) and TFE3 | 0.81 | 911-929 | 920
SEQ ID NO: 144 | V$ETSF/PU1.01 | Pu.1 (Pu120) Ets-like transcription factor identified in lymphoid B-cells | 0.86 | 927-943 | 935
SEQ ID NO: 145 | V$OCT1/OCT1.06 | octamer-binding factor 1 | 0.80 | 932-946 | 939
SEQ ID NO: 146 | V$TALE/TGIF.01 | TG-interacting factor belonging to TALE class of homeodomain factors | 1.00 | 936-942 | 939
SEQ ID NO: 147 | V$MITF/MIT.01 | MIT (microphthalmia transcription factor) and TFE3 | 0.81 | 935-953 | 944
SEQ ID NO: 148 | V$OCT1/OCT1.04 | octamer-binding factor 1 | 0.80 | 941-955 | 948
SEQ ID NO: 149 | V$GATA/GATA.01 | GATA binding site (consensus) | 0.95 | 962-974 | 968
SEQ ID NO: 150 | V$SRFF/SRF.01 | serum response factor | 0.66 | 968-986 | 977
SEQ ID NO: 151 | V$CDXF/CDX2.01 | Cdx-2 mammalian caudal related intestinal transcr. factor | 0.84 | 970-988 | 979
SEQ ID NO: 152 | V$FKHD/XFD2.01 | Xenopus fork head domain factor 2 | 0.89 | 972-988 | 980
SEQ ID NO: 153 | V$MEF2/MEF2.01 | myogenic enhancer factor 2 | 0.74 | 970-992 | 981
SEQ ID NO: 154 | V$TBPF/TATA.01 | cellular and viral TATA box elements | 0.90 | 973-989 | 981
SEQ ID NO: 155 | V$CART/CART1.01 | Cart-1 (cartilage homeoprotein 1) | 0.84 | 978-994 | 986
SEQ ID NO: 156 | V$CART/CART1.01 | Cart-1 (cartilage homeoprotein 1) | 0.84 | 985-1001 | 993
SEQ ID NO: 157 | V$SATB/SATB1.01 | Special AT-rich sequence-binding protein 1, predominantly expressed in thymocytes, binds to matrix attachment regions (MARs) | 0.93 | 985-1001 | 993
SEQ ID NO: 158 | V$BRNF/BRN3.01 | POU transcription factor Brn-3 | 0.78 | 987-1003 | 995
SEQ ID NO: 159 | V$CLOX/CDP.01 | cut-like homeodomain protein | 0.75 | 987-1003 | 995
SEQ ID NO: 160 | V$HOMS/S8.01 | Binding site for S8 type homeodomains | 0.97 | 992-1000 | 996
SEQ ID NO: 161 | V$NKXH/DLX1.01 | DLX-1, -2, and -5 binding sites | 0.91 | 990-1002 | 996
SEQ ID NO: 162 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 989-1005 | 997
SEQ ID NO: 163 | V$PDX1/PDX1.01 | Pdx1 (IDX1/IPF1) pancreatic and intestinal homeodomain TF | 0.74 | 988-1008 | 998
SEQ ID NO: 164 | V$FKHD/XFD3.01 | Xenopus fork head domain factor 3 | 0.82 | 998-1014 | 1006
SEQ ID NO: 165 | V$HNF1/HNF1.01 | hepatic nuclear factor 1 | 0.78 | 1000-1016 | 1008
SEQ ID NO: 166 | V$HNF1/HNF1.01 | hepatic nuclear factor 1 | 0.78 | 1002-1018 | 1010
SEQ ID NO: 167 | V$PAX4/PAX4.01 | Pax-4 paired domain protein, together with PAX-6 involved in pancreatic development | 0.97 | 1005-1015 | 1010
SEQ ID NO: 168 | V$HOMS/S8.01 | Binding site for S8 type homeodomains | 0.97 | 1007-1015 | 1011
SEQ ID NO: 169 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 1003-1019 | 1011
SEQ ID NO: 170 | V$NKXH/DLX1.01 | DLX-1, -2, and -5 binding sites | 0.91 | 1005-1017 | 1011
SEQ ID NO: 171 | V$RBIT/BRIGHT.01 | Bright, B cell regulator of IgH transcription | 0.92 | 1005-1017 | 1011
SEQ ID NO: 172 | V$TBPF/ATATA.01 | Avian C-type LTR TATA box | 0.81 | 1005-1021 | 1013
SEQ ID NO: 173 | V$CREB/CREBP1.01 | cAMP-responsive element binding protein 1 | 0.80 | 1004-1024 | 1014
SEQ ID NO: 174 | V$RORA/RORA2.01 | RAR-related orphan receptor alpha2 | 0.82 | 1007-1023 | 1015
SEQ ID NO: 175 | V$PCAT/CAAT.01 | cellular and viral CCAAT box | 0.90 | 1022-1032 | 1027
SEQ ID NO: 176 | V$NKXH/NKX25.02 | homeo domain factor Nkx-2.5/Csx, tinman homolog low affinity sites | 0.88 | 1022-1034 | 1028
SEQ ID NO: 177 | V$CREB/HLF.01 | hepatic leukemia factor | 0.84 | 1022-1042 | 1032
SEQ ID NO: 178 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 1056-1072 | 1064
SEQ ID NO: 179 | V$HOMS/S8.01 | Binding site for S8 type homeodomains | 0.97 | 1061-1069 | 1065
SEQ ID NO: 180 | V$NKXH/DLX1.01 | DLX-1, -2, and -5 binding sites | 0.91 | 1059-1071 | 1065
SEQ ID NO: 181 | V$RBIT/BRIGHT.01 | Bright, B cell regulator of IgH transcription | 0.92 | 1059-1071 | 1065
SEQ ID NO: 182 | V$BRNF/BRN2.01 | POU factor Brn-2 (N-Oct 3) | 0.91 | 1058-1074 | 1066
SEQ ID NO: 183 | V$OCT1/OCT1.06 | octamer-binding factor 1 | 0.80 | 1060-1074 | 1067
SEQ ID NO: 184 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 1061-1077 | 1069
SEQ ID NO: 185 | V$OCT1/OCT1.06 | octamer-binding factor 1 | 0.80 | 1079-1093 | 1086
SEQ ID NO: 186 | V$FAST/FAST1.01 | FAST-1 SMAD interacting protein | 0.81 | 1080-1094 | 1087
SEQ ID NO: 187 | V$RREB/RREB1.01 | Ras-responsive element binding protein 1 | 0.79 | 1081-1095 | 1088
SEQ ID NO: 188 | V$E2FF/E2F.02 | E2F, involved in cell cycle regulation, interacts with Rb p107 protein | 0.84 | 1085-1099 | 1092
SEQ ID NO: 189 | V$CREB/TAXCREB.01 | Tax/CREB complex | 0.81 | 1091-1111 | 1101
SEQ ID NO: 190 | V$AP1F/VMAF.01 | v-Maf | 0.82 | 1092-1112 | 1102
SEQ ID NO: 191 | V$MYT1/MYT1.01 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.75 | 1123-1135 | 1129
SEQ ID NO: 192 | V$CLOX/CLOX.01 | Clox | 0.81 | 1136-1152 | 1144
SEQ ID NO: 193 | V$HNF4/HNF4.01 | Hepatic nuclear factor 4 | 0.82 | 1156-1172 | 1164
SEQ ID NO: 194 | V$LEFF/LEF1.01 | TCF/LEF-1, involved in the Wnt signal transduction pathway | 0.86 | 1157-1173 | 1165
SEQ ID NO: 195 | V$PERO/PPARA.01 | PPAR/RXR heterodimers | 0.70 | 1157-1177 | 1167
SEQ ID NO: 196 | V$CLOX/CLOX.01 | Clox | 0.81 | 1173-1189 | 1181
SEQ ID NO: 197 | V$HNF6/HNF6.01 | Liver enriched Cut-Homeodomain transcription factor HNF6 (ONECUT) | 0.82 | 1175-1189 | 1182
SEQ ID NO: 198 | V$SRFF/SRF.02 | serum response factor | 0.83 | 1177-1195 | 1186
SEQ ID NO: 199 | V$CLOX/CDPCR3.01 | cut-like homeodomain protein | 0.75 | 1180-1196 | 1188
SEQ ID NO: 200 | V$PIT1/PIT1.01 | Pit1, GHF-1 pituitary specific pou domain transcription factor | 0.86 | 1186-1196 | 1191
SEQ ID NO: 201 | V$HMTB/MTBF.01 | muscle-specific Mt binding site | 0.90 | 1196-1204 | 1200
SEQ ID NO: 202 | V$FKHD/HFH8.01 | HNF-3/Fkh Homolog-8 | 0.92 | 1200-1216 | 1208
SEQ ID NO: 203 | V$E4FF/E4F.01 | GLI-Krueppel-related transcription factor, regulator of adenovirus E4 promoter | 0.82 | 1223-1235 | 1229
SEQ ID NO: 204 | V$CREB/HLF.01 | hepatic leukemia factor | 0.84 | 1221-1241 | 1231
SEQ ID NO: 205 | V$VBPF/VBP.01 | PAR-type chicken vitellogenin promoter-binding protein | 0.86 | 1226-1236 | 1231
SEQ ID NO: 206 | V$OCT1/OCT.01 | Octamer binding site (OCT1/OCT2 consensus) | 0.79 | 1259-1273 | 1266
SEQ ID NO: 207 | V$STAT/STAT6.01 | STAT6: signal transducer and activator of transcription 6 | 0.84 | 1261-1279 | 1270
SEQ ID NO: 208 | V$CDXF/CDX2.01 | Cdx-2 mammalian caudal related intestinal transcr. factor | 0.84 | 1270-1288 | 1279
SEQ ID NO: 209 | V$SORY/SOX9.01 | SOX (SRY-related HMG box) | 0.90 | 1280-1296 | 1288
SEQ ID NO: 210 | V$FKHD/HFH2.01 | HNF-3/Fkh Homolog 2 | 0.93 | 1285-1301 | 1293
SEQ ID NO: 211 | V$CDXF/CDX2.01 | Cdx-2 mammalian caudal related intestinal transcr. factor | 0.84 | 1286-1304 | 1295
SEQ ID NO: 212 | V$OCTB/TST1.01 | POU-factor Tst-1/Oct-6 | 0.87 | 1288-1302 | 1295
SEQ ID NO: 213 | V$PDX1/ISL1.01 | Pancreatic and intestinal lim-homeodomain factor | 0.82 | 1298-1318 | 1308
SEQ ID NO: 214 | V$SORY/SOX9.01 | SOX (SRY-related HMG box) | 0.90 | 1308-1324 | 1316
SEQ ID NO: 215 | V$CREB/HLF.01 | hepatic leukemia factor | 0.84 | 1310-1330 | 1320
SEQ ID NO: 216 | V$VBPF/VBP.01 | PAR-type chicken vitellogenin promoter-binding protein | 0.86 | 1315-1325 | 1320
SEQ ID NO: 217 | V$CEBP/CEBPB.01 | CCAAT/enhancer binding protein beta | 0.94 | 1313-1331 | 1322
SEQ ID NO: 218 | V$PDX1/ISL1.01 | Pancreatic and intestinal lim-homeodomain factor | 0.82 | 1313-1333 | 1323
SEQ ID NO: 219 | V$HAML/AML1.01 | runt-factor AML-1 | 1.00 | 1323-1337 | 1330
SEQ ID NO: 220 | V$GREF/ARE.01 | Androgene receptor binding site | 0.80 | 1323-1341 | 1332
SEQ ID NO: 221 | V$TEAF/TEF1.01 | TEF-1 related muscle factor | 0.84 | 1343-1355 | 1349
SEQ ID NO: 222 | V$CMYB/CMYB.01 | c-Myb, important in hematopoesis, cellular equivalent to avian myoblastosis virus oncogene v-myb | 0.99 | 1352-1360 | 1356
SEQ ID NO: 223 | V$AP4R/TH1E47.01 | Thing1/E47 heterodimer, TH1 bHLH member specific expression in a variety of embryonic tissues | 0.93 | 1378-1394 | 1386
SEQ ID NO: 224 | V$CP2F/CP2.01 | CP2 | 0.90 | 1384-1394 | 1389
SEQ ID NO: 225 | V$CHOP/CHOP.01 | heterodimers of CHOP and C/EBPalpha | 0.90 | 1386-1398 | 1392
SEQ ID NO: 226 | V$CEBP/CEBP.02 | C/EBP binding site | 0.85 | 1385-1403 | 1394
SEQ ID NO: 227 | V$MEF2/HMEF2.01 | myocyte enhancer factor | 0.76 | 1384-1406 | 1395
SEQ ID NO: 228 | V$OCT1/OCT1.03 | octamer-binding factor 1 | 0.85 | 1388-1402 | 1395
SEQ ID NO: 229 | V$HMTB/MTBF.01 | muscle-specific Mt binding site | 0.90 | 1394-1402 | 1398
SEQ ID NO: 230 | V$CLOX/CDPCR3.01 | cut-like homeodomain protein | 0.75 | 1422-1438 | 1430
SEQ ID NO: 231 | V$OCT1/OCT1.05 | octamer-binding factor 1 | 0.90 | 1423-1437 | 1430
SEQ ID NO: 232 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 1423-1439 | 1431
SEQ ID NO: 233 | V$PDX1/PDX1.01 | Pdx1 (IDX1/IPF1) pancreatic and intestinal homeodomain TF | 0.74 | 1423-1443 | 1433
SEQ ID NO: 234 | V$SORY/SOX5.01 | Sox-5 | 0.87 | 1426-1442 | 1434
SEQ ID NO: 235 | V$OCT1/OCT1.05 | octamer-binding factor 1 | 0.90 | 1444-1458 | 1451
SEQ ID NO: 236 | V$CREB/E4BP4.01 | E4BP4, bZIP domain, transcriptional repressor | 0.80 | 1443-1463 | 1453
SEQ ID NO: 237 | V$VBPF/VBP.01 | PAR-type chicken vitellogenin promoter-binding protein | 0.86 | 1449-1459 | 1454
SEQ ID NO: 238 | V$TBPF/MTATA.01 | Muscle TATA box | 0.84 | 1455-1471 | 1463
SEQ ID NO: 239 | V$PBXF/PBX1.01 | homeo domain factor Pbx-1 | 0.78 | 1469-1481 | 1475
SEQ ID NO: 240 | V$COMP/COMP1.01 | COMP1, cooperates with myogenic proteins in multicomponent complex | 0.76 | 1467-1487 | 1477
SEQ ID NO: 241 | V$SORY/SOX5.01 | Sox-5 | 0.87 | 1478-1494 | 1486
SEQ ID NO: 242 | V$FKHD/FREAC2.01 | Fork head RElated ACtivator-2 | 0.84 | 1485-1501 | 1493
SEQ ID NO: 243 | V$PDX1/ISL1.01 | Pancreatic and intestinal lim-homeodomain factor | 0.82 | 1495-1515 | 1505
SEQ ID NO: 244 | V$HOXF/HOX1-3.01 | Hox-1.3, vertebrate homeobox protein | 0.83 | 1499-1515 | 1507
SEQ ID NO: 245 | V$PDX1/PDX1.01 | Pdx1 (IDX1/IPF1) pancreatic and intestinal homeodomain TF | 0.74 | 1498-1518 | 1508
SEQ ID NO: 246 | V$CART/XVENT2.01 | Xenopus homeodomain factor Xvent-2; early BMP signaling response | 0.82 | 1502-1518 | 1510
SEQ ID NO: 247 | V$CDXF/CDX2.01 | Cdx-2 mammalian caudal related intestinal transcr. factor | 0.84 | 1507-1525 | 1516
SEQ ID NO: 248 | V$MEF2/MEF2.05 | MEF2 | 0.96 | 1505-1527 | 1516
SEQ ID NO: 249 | V$HNF1/HNF1.01 | hepatic nuclear factor 1 | 0.78 | 1510-1526 | 1518
SEQ ID NO: 250 | V$OCT1/OCT1.06 | octamer-binding factor 1 | 0.80 | 1511-1525 | 1518
SEQ ID NO: 251 | V$TBPF/TATA.02 | Mammalian C-type LTR TATA box | 0.89 | 1510-1526 | 1518
SEQ ID NO: 252 | V$NKXH/MSX.01 | Homeodomain proteins MSX-1 and MSX-2 | 0.97 | 1514-1526 | 1520
SEQ ID NO: 253 | V$RBIT/BRIGHT.01 | Bright, B cell regulator of IgH transcription | 0.92 | 1515-1527 | 1521
SEQ ID NO: 254 | V$MEF2/AMEF2.01 | myocyte enhancer factor | 0.80 | 1514-1536 | 1525
SEQ ID NO: 255 | V$EVI1/EVI1.02 | Ecotropic viral integration site 1 encoded factor | 0.83 | 1526-1542 | 1534
SEQ ID NO: 256 | V$GATA/GATA1.02 | GATA-binding factor 1 | 0.99 | 1528-1540 | 1534
SEQ ID NO: 257 | V$GATA/GATA3.02 | GATA-binding factor 3 | 0.91 | 1537-1549 | 1543
SEQ ID NO: 258 | V$GATA/GATA3.02 | GATA-binding factor 3 | 0.91 | 1559-1571 | 1565
SEQ ID NO: 259 | V$OCT1/OCT1.02 | octamer-binding factor 1 | 0.82 | 1561-1575 | 1568
SEQ ID NO: 260 | V$CEBP/CEBPB.01 | CCAAT/enhancer binding protein beta | 0.94 | 1567-1585 | 1576
SEQ ID NO: 261 | V$PLZF/PLZF.01 | Promyelocytic leukemia zink finger (TF with nine Krueppel-like zink fingers) | 0.86 | 1574-1588 | 1581
SEQ ID NO: 262 | V$PAX3/PAX3.01 | Pax-3 paired domain protein, expressed in embryogenesis, mutations correlate to Waardenburg Syndrome | 0.76 | 1587-1599 | 1593
SEQ ID NO: 263 | V$CREB/ATF.01 | activating transcription factor | 0.90 | 1588-1608 | 1598
SEQ ID NO: 264 | V$AP4R/TH1E47.01 | Thing1/E47 heterodimer, TH1 bHLH member specific expression in a variety of embryonic tissues | 0.93 | 1614-1630 | 1622
SEQ ID NO: 265 | V$NKXH/MSX.01 | Homeodomain proteins MSX-1 and MSX-2 | 0.97 | 1619-1631 | 1625
SEQ ID NO: 266 | V$RBIT/BRIGHT.01 | Bright, B cell regulator of IgH transcription | 0.92 | 1620-1632 | 1626
SEQ ID NO: 267 | V$OCTB/TST1.01 | POU-factor Tst-1/Oct-6 | 0.87 | 1620-1634 | 1627
SEQ ID NO: 268 | V$NKXH/DLX3.01 | Distal-less 3 homeodomain transcription factor | 0.91 | 1628-1640 | 1634
SEQ ID NO: 269 | V$GREF/PRE.01 | Progesterone receptor binding site | 0.84 | 1628-1646 | 1637
SEQ ID NO: 270 | V$TBPF/TATA.01 | cellular and viral TATA box elements | 0.90 | 1636-1652 | 1644
SEQ ID NO: 271 | V$FKHD/XFD2.01 | Xenopus fork head domain factor 2 | 0.89 | 1637-1653 | 1645
SEQ ID NO: 272 | V$TBPF/TATA.01 | cellular and viral TATA box elements | 0.90 | 1638-1654 | 1646
SEQ ID NO: 273 | V$CREB/E4BP4.01 | E4BP4, bZIP domain, transcriptional repressor | 0.80 | 1638-1658 | 1648
SEQ ID NO: 274 | V$PDX1/ISL1.01 | Pancreatic and intestinal lim-homeodomain factor | 0.82 | 1644-1664 | 1654
SEQ ID NO: 275 | V$COMP/COMP1.01 | COMP1, cooperates with myogenic proteins in multicomponent complex | 0.76 | 1648-1668 | 1658
SEQ ID NO: 276 | V$TBPF/TATA.02 | Mammalian C-type LTR TATA box | 0.89 | 1658-1674 | 1666
SEQ ID NO: 277 | V$IRFF/ISRE.01 | interferon-stimulated response element | 0.81 | 1662-1676 | 1669
SEQ ID NO: 278 | V$XBBF/RFX1.01 | X-box binding protein RFX1 | 0.89 | 1660-1678 | 1669
SEQ ID NO: 279 | V$MYT1/MYT1.02 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.88 | 1667-1679 | 1673
SEQ ID NO: 280 | V$OCT1/OCT1.06 | octamer-binding factor 1 | 0.80 | 1683-1697 | 1690
SEQ ID NO: 281 | V$AP1F/TCF11MAFG.01 | TCF11/MafG heterodimers, binding to subclass of AP1 sites | 0.81 | 1681-1701 | 1691
SEQ ID NO: 282 | V$NKXH/MSX2.01 | Muscle segment homeo box 2, homologue of Drosophila (HOX 8) | 0.95 | 1687-1699 | 1693
SEQ ID NO: 283 | V$FAST/FAST1.01 | FAST-1 SMAD interacting protein | 0.81 | 1687-1701 | 1694
SEQ ID NO: 284 | V$PBXC/PBX1_MEIS1.03 | Binding site for a Pbx1/Meis1 heterodimer | 0.76 | 1686-1702 | 1694
SEQ ID NO: 285 | V$CIZF/NMP4.01 | NMP4 (nuclear matrix protein 4)/CIZ (Cas-interacting zinc finger protein) | 0.97 | 1699-1709 | 1704
SEQ ID NO: 286 | V$STAT/STAT6.01 | STAT6: signal transducer and activator of transcription 6 | 0.84 | 1702-1720 | 1711
SEQ ID NO: 287 | V$AP4R/TAL1BETAE47.01 | Tal-1beta/E47 heterodimer | 0.87 | 1710-1726 | 1718
SEQ ID NO: 288 | V$SORY/HMGIY.01 | HMGI(Y) high-mobility-group protein I (Y), architectural transcription factor organizing the framework of a nuclear protein-DNA transcriptional complex | 0.92 | 1720-1736 | 1728
SEQ ID NO: 289 | V$MYT1/MYT1.01 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.75 | 1723-1735 | 1729
SEQ ID NO: 290 | V$SRFF/SRF.01 | serum response factor | 0.66 | 1728-1746 | 1737
SEQ ID NO: 291 | V$HOXF/HOXA9.01 | Member of the vertebrate HOX-cluster of homeobox factors | 0.87 | 1731-1747 | 1739
SEQ ID NO: 292 | V$HOXT/MEIS1_HOXA9.01 | Homeobox protein MEIS1 binding site | 0.79 | 1734-1746 | 1740
SEQ ID NO: 293 | V$PIT1/PIT1.01 | Pit1, GHF-1 pituitary specific pou domain transcription factor | 0.86 | 1737-1747 | 1742
SEQ ID NO: 294 | V$AP1F/AP1.01 | AP1 binding site | 0.95 | 1734-1754 | 1744
SEQ ID NO: 295 | V$VBPF/VBP.01 | PAR-type chicken vitellogenin promoter-binding protein | 0.86 | 1746-1756 | 1751
SEQ ID NO: 296 | V$FAST/FAST1.01 | FAST-1 SMAD interacting protein | 0.81 | 1757-1771 | 1764
SEQ ID NO: 297 | V$HOXF/EN1.01 | Homeobox protein engrailed (en-1) | 0.77 | 1759-1775 | 1767
SEQ ID NO: 298 | V$TBPF/MTATA.01 | Muscle TATA box | 0.84 | 1763-1779 | 1771
SEQ ID NO: 299 | V$ETSF/ETS2.01 | c-Ets-2 binding site | 0.86 | 1774-1790 | 1782
SEQ ID NO: 300 | V$MYT1/MYT1.02 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.88 | 1780-1792 | 1786
SEQ ID NO: 301 | V$GFI1/GFI1.01 | Growth factor independence 1 zinc finger protein acts as transcriptional repressor | 0.97 | 1782-1796 | 1789
SEQ ID NO: 302 | V$TBPF/TATA.01 | cellular and viral TATA box elements | 0.90 | 1784-1800 | 1792
SEQ ID NO: 303 | V$BRNF/BRN2.01 | POU factor Brn-2 (N-Oct 3) | 0.91 | 1786-1802 | 1794
SEQ ID NO: 304 | V$HOXT/MEIS1_HOXA9.01 | Homeobox protein MEIS1 binding site | 0.79 | 1788-1800 | 1794
SEQ ID NO: 305 | V$MEF2/AMEF2.01 | myocyte enhancer factor | 0.80 | 1783-1805 | 1794
SEQ ID NO: 306 | V$OCTB/TST1.01 | POU-factor Tst-1/Oct-6 | 0.87 | 1787-1801 | 1794
SEQ ID NO: 307 | V$HOXF/HOXA9.01 | Member of the vertebrate HOX-cluster of homeobox factors | 0.87 | 1787-1803 | 1795
SEQ ID NO: 308 | V$BRNF/BRN2.01 | POU factor Brn-2 (N-Oct 3) | 0.91 | 1788-1804 | 1796
SEQ ID NO: 309 | V$PARF/DBP.01 | Albumin D-box binding protein | 0.84 | 1791-1805 | 1798
SEQ ID NO: 310 | V$OCT1/OCT1.02 | octamer-binding factor 1 | 0.82 | 1795-1809 | 1802
SEQ ID NO: 311 | V$FKHD/FREAC2.01 | Fork head RElated ACtivator-2 | 0.84 | 1816-1832 | 1824
SEQ ID NO: 312 | V$SORY/SOX5.01 | Sox-5 | 0.87 | 1821-1837 | 1829
SEQ ID NO: 313 | V$AREB/AREB6.04 | AREB6 (Atp1a1 regulatory element binding factor 6) | 0.98 | 1837-1849 | 1843
SEQ ID NO: 314 | V$MYT1/MYT1.02 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.88 | 1848-1860 | 1854
SEQ ID NO: 315 | V$RBPF/RBPJK.01 | Mammalian transcriptional repressor RBP-Jkappa/CBF1 | 0.84 | 1851-1865 | 1858
SEQ ID NO: 316 | V$OCT1/OCT1.02 | octamer-binding factor 1 | 0.82 | 1875-1889 | 1882
SEQ ID NO: 317 | V$FKHD/FREAC4.01 | Fork head RElated ACtivator-4 | 0.78 | 1875-1891 | 1883
SEQ ID NO: 318 | V$EBOX/MYCMAX.02 | c-Myc/Max heterodimer | 0.92 | 1880-1896 | 1888
SEQ ID NO: 319 | V$PAX6/PAX6.01 | Pax-6 paired domain protein | 0.75 | 1880-1898 | 1889
SEQ ID NO: 320 | V$IRFF/IRF3.01 | Interferon regulatory factor 3 (IRF-3) | 0.86 | 1891-1905 | 1898
SEQ ID NO: 321 | V$HNF1/HNF1.02 | Hepatic nuclear factor 1 | 0.76 | 1895-1911 | 1903
SEQ ID NO: 322 | V$FKHD/FREAC2.01 | Fork head RElated ACtivator-2 | 0.84 | 1898-1914 | 1906
SEQ ID NO: 323 | V$E4FF/E4F.01 | GLI-Krueppel-related transcription factor, regulator of adenovirus E4 promoter | 0.82 | 1902-1914 | 1908
SEQ ID NO: 324 | V$CREB/CREBP1.01 | cAMP-responsive element binding protein 1 | 0.80 | 1900-1920 | 1910
SEQ ID NO: 325 | V$VBPF/VBP.01 | PAR-type chicken vitellogenin promoter-binding protein | 0.86 | 1905-1915 | 1910
SEQ ID NO: 326 | V$MYT1/MYT1.01 | MyT1 zinc finger transcription factor involved in primary neurogenesis | 0.75 | 1912-1924 | 1918
SEQ ID NO: 327 | V$HNF1/HNF1.01 | hepatic nuclear factor 1 | 0.78 | 1913-1929 | 1921
SEQ ID NO: 328 | V$PCAT/CAAT.01 | cellular and viral CCAAT box | 0.90 | 1928-1938 | 1933
SEQ ID NO: 329 | V$HNF6/HNF6.01 | Liver enriched Cut-Homeodomain transcription factor HNF6 (ONECUT) | 0.82 | 1929-1943 | 1936
SEQ V$PXRF/PXRCAR.01 Halfsite of PXR 0.98 1935-1945 1940 ID (pregnane X NO: receptor)/RXR 330 resp. 
CAR (constitutive androstane receptor)/RXR heterodimer binding site SEQ V$RARF/RTR.01 Retinoid 0.81 1934-1952 1943 ID receptor-related NO: testis-associated 331 receptor (GCNF/RTR) SEQ V$HOXF/EN1.01 Homeobox 0.77 1936-1952 1944 ID protein engrailed NO: (en-1) 332 SEQ V$NKXH/NKX25.01 homeo domain 1.00 1939-1951 1945 ID factor Nkx- NO: 2.5/Csx, tinman 333 homolog, high affinity sites SEQ V$GATA/GATA3.02 GATA-binding 0.91 1953-1965 1959 ID factor 3 NO: 334 SEQ V$TBPF/TATA.01 cellular and viral 0.90 1968-1984 1976 ID TATA box NO: elements 335 SEQ V$SRFF/SRF.01 serum response 0.66 1969-1987 1978 ID factor NO: 336 SEQ V$CLOX/CDPCR3.01 cut-like 0.75 1972-1988 1980 ID homeodomain NO: protein 337 SEQ V$PAX1/PAX1.01 Pax1 paired 0.61 2016-2034 2025 ID domain protein, NO: expressed in the 338 developing vertebral column of mouse embryos SEQ V$TBPF/ATATA.01 Avian C-type LTR 0.81 2019-2035 2027 ID TATA box NO: 339 SEQ V$GFI1/GfI1B.01 Growth factor 0.82 2021-2035 2028 ID independence 1 NO: zinc finger 340 protein Gfi-1B SEQ V$NRSF/NRSF.01 neuron-restrictive 0.69 2025-2045 2035 ID silencer factor NO: 341 SEQ V$NFAT/NFAT.01 Nuclear factor of 0.97 2033-2043 2038 ID activated T-cells NO: 342 SEQ V$AREB/AREB6.04 AREB6 (Atp1a1 0.98 2034-2046 2040 ID regulatory NO: element binding 343 factor 6) SEQ V$HNF1/HNF1.01 hepatic nuclear 0.78 2036-2052 2044 ID factor 1 NO: 344 SEQ V$FKHD/XFD3.01 Xenopus fork 0.82 2038-2054 2046 ID head domain NO: factor 3 345 SEQ V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1) 0.74 2036-2056 2046 ID pancreatic and NO: intestinal 346 homeodomain TF SEQ V$OCT1/OCT1.01 octamer-binding 0.77 2050-2064 2057 ID factor 1 NO: 347 SEQ V$TBPF/TATA.01 cellular and viral 0.90 2053-2069 2061 ID TATA box NO: elements 348 SEQ V$ETSF/GABP.01 GABP: GA 0.85 2080-2096 2088 ID binding protein NO: 349 SEQ V$BEL1/BEL1.01 Bel-1 similar 0.78 2083-2105 2094 ID region (defined in NO: Lentivirus LTRs) 350 SEQ V$VMYB/VMYB.01 v-Myb 0.90 2097-2107 2102 ID NO: 351 SEQ V$GREF/ARE.01 Androgen 0.80 2106-2124 
2115 ID receptor binding NO: site 352 SEQ V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1) 0.74 2137-2157 2147 ID pancreatic and NO: intestinal 353 homeodomain TF SEQ V$MYOD/MYOD.02 myoblast 0.98 2154-2168 2161 ID determining NO: factor 354 SEQ V$GATA/GATA1.03 GATA-binding 0.95 2169-2181 2175 ID factor 1 NO: 355 SEQ V$AP4R/TAL1BETAE47.01 Tal-1beta/E47 0.87 2179-2195 2187 ID heterodimer NO: 356 SEQ V$OAZF/ROAZ.01 Rat C2H2 Zn 0.73 2204-2220 2212 ID finger protein NO: involved in 357 olfactory neuronal differentiation SEQ V$GATA/GATA1.01 GATA-binding 0.96 2217-2229 2223 ID factor 1 NO: 358 SEQ V$MYOD/E47.02 TAL1/E47 dimers 0.93 2220-2234 2227 ID NO: 359 SEQ V$LTUP/TAACC.01 Lentiviral TATA 0.71 2225-2247 2236 ID upstream NO: element 360 SEQ V$RREB/RREB1.01 Ras-responsive 0.79 2239-2253 2246 ID element binding NO: protein 1 361 SEQ V$OCT1/OCT1.05 octamer-binding 0.90 2251-2265 2258 ID factor 1 NO: 362 SEQ V$OCT1/OCT1.02 octamer-binding 0.82 2282-2296 2289 ID factor 1 NO: 363 SEQ V$COUP/COUP.01 COUP 0.81 2284-2298 2291 ID antagonizes HNF- NO: 4 by binding site 364 competition or synergizes by direct protein - protein interaction with HNF-4 SEQ V$MEF2/MEF2.01 myogenic 0.74 2290-2312 2301 ID enhancer factor 2 NO: 365 SEQ V$CDXF/CDX2.01 Cdx-2 0.84 2296-2314 2305 ID mammalian NO: caudal related 366 intestinal transcr. 
factor SEQ V$MYT1/MYT1.01 MyT1 zinc finger 0.75 2301-2313 2307 ID transcription NO: factor involved in 367 primary neurogenesis SEQ V$NFAT/NFAT.01 Nuclear factor of 0.97 2314-2324 2319 ID activated T-cells NO: 368 SEQ V$CIZF/NMP4.01 NMP4 (nuclear 0.97 2317-2327 2322 ID matrix protein 4)/ NO: CIZ (Cas- 369 interacting zinc finger protein) SEQ V$GATA/GATA3.02 GATA-binding 0.91 2326-2338 2332 ID factor 3 NO: 370 SEQ V$HMTB/MTBF.01 muscle-specific 0.90 2351-2359 2355 ID Mt binding site NO: 371 SEQ V$NOLF/OLF1.01 olfactory neuron- 0.82 2350-2372 2361 ID specific factor NO: 372 SEQ V$PDX1/PDX1.01 Pdx1 (IDX1/IPF1) 0.74 2363-2383 2373 ID pancreatic and NO: intestinal 373 homeodomain TF SEQ V$GATA/GATA3.02 GATA-binding 0.91 2395-2407 2401 ID factor 3 NO: 374 SEQ V$NFAT/NFAT.01 Nuclear factor of 0.97 2406-2416 2411 ID activated T-cells NO: 375 SEQ V$OCTP/OCT1P.01 octamer-binding 0.86 2433-2445 2439 ID factor 1, POU- NO: specific domain 376 SEQ V$MITF/MIT.01 MIT 0.81 2438-2456 2447 ID (microphthalmia NO: transcription 377 factor) and TFE3 SEQ V$PAX8/PAX8.01 PAX 2/5/8 0.88 2441-2453 2447 ID binding site NO: 378 SEQ V$TBPF/ATATA.01 Avian C-type LTR 0.81 2451-2467 2459 ID TATA box NO: 379 SEQ V$GATA/GATA3.02 GATA-binding 0.91 2462-2474 2468 ID factor 3 NO: 380 SEQ V$CLOX/CLOX.01 Clox 0.81 2462-2478 2470 ID NO: 381 SEQ V$HNF6/HNF6.01 Liver enriched 0.82 2464-2478 2471 ID Cut - NO: Homeodomain 382 transcription factor HNF6 (ONECUT) SEQ V$PIT1/PIT1.01 Pit1, GHF-1 0.86 2468-2478 2473 ID pituitary specific NO: pou domain 383 transcription factor SEQ V$AP4R/TAL1BETAITF2.01 Tal-1beta/ITF-2 0.85 2469-2485 2477 ID heterodimer NO: 384 SEQ V$CIZF/NMP4.01 NMP4 (nuclear 0.97 2477-2487 2482 ID matrix protein 4)/ NO: CIZ (Cas- 385 interacting zinc finger protein) SEQ V$NFAT/NFAT.01 Nuclear factor of 0.97 2480-2490 2485 ID activated T-cells NO: 386 SEQ V$STAT/STAT.01 signal 0.87 2479-2497 2488 ID transducers and NO: activators of 387 transcription SEQ V$TBPF/TATA.02 Mammalian C- 0.89 2484-2500 
2492 ID type LTR TATA NO: box 388 SEQ V$FKHD/XFD3.01 Xenopus fork 0.82 2501-2517 2509 ID head domain NO: factor 3 389 SEQ V$AP1F/AP1.01 AP1 binding site 0.95 2500-2520 2510 ID NO: 390 SEQ V$AP1F/AP1.01 AP1 binding site 0.95 2504-2524 2514 ID NO: 391 SEQ V$PCAT/CAAT.01 cellular and viral 0.90 2513-2523 2518 ID CCAAT box NO: 392 SEQ V$CDXF/CDX2.01 Cdx-2 0.84 2524-2542 2533 ID mammalian NO: caudal related 393 intestinal transcr. factor SEQ V$MYT1/MYT1.02 MyT1 zinc finger 0.88 2539-2551 2545 ID transcription NO: factor involved in 394 primary neurogenesis SEQ V$ETSF/FLI.01 ETS family 0.81 2560-2576 2568 ID member FLI NO: 395 SEQ V$MYT1/MYT1.01 MyT1 zinc finger 0.75 2569-2581 2575 ID transcription NO: factor involved in 396 primary neurogenesis SEQ V$TBPF/ATATA.01 Avian C-type LTR 0.81 2576-2592 2584 ID TATA box NO: 397 SEQ V$SATB/SATB1.01 Special AT-rich 0.93 2578-2594 2586 ID sequence-binding NO: protein 1, 398 predominantly expressed in thymocytes, binds to matrix attachment regions (MARs) SEQ V$NKXH/NKX31.01 prostate-specific 0.84 2584-2596 2590 ID homeodomain NO: protein NKX3.1 399 SEQ V$PARF/DBP.01 Albumin D-box 0.84 2589-2603 2596 ID binding protein NO: 400 SEQ V$PAX5/PAX5.02 B-cell-specific 0.75 2591-2619 2605 ID activating protein NO: 401 SEQ V$ECAT/NFY.03 nuclear factor Y 0.80 2604-2618 2611 ID (Y-box binding NO: factor) 402 SEQ V$GFI1/GFI1.01 Growth factor 0.97 2608-2622 2615 ID independence 1 NO: zinc finger 403 protein acts as transcriptional repressor SEQ V$HNF6/HNF6.01 Liver enriched 0.82 2608-2622 2615 ID Cut - NO: Homeodomain 404 transcription factor HNF6 (ONECUT) SEQ V$MYT1/MYT1.01 MyT1 zinc finger 0.75 2610-2622 2616 ID transcription NO: factor involved in 405 primary neurogenesis SEQ V$PAX8/PAX8.01 PAX 2/5/8 0.88 2610-2622 2616 ID binding site NO: 406 SEQ V$TTFF/TTF1.01 Thyroid 0.92 2609-2623 2616 ID transcription NO: factor-1 (TTF1) 407 binding site SEQ V$MYT1/MYT1.02 MyT1 zinc finger 0.88 2612-2624 2618 ID transcription NO: factor involved in 408 
primary neurogenesis SEQ V$CDXF/CDX2.01 Cdx-2 0.84 2612-2630 2621 ID mammalian NO: caudal related 409 intestinal transcr. factor SEQ V$SORY/HMGIY.01 HMGI(Y) high- 0.92 2649-2665 2657 ID mobility-group NO: protein I (Y), 410 architectural transcription factor organizing the framework of a nuclear protein- DNA transcriptional complex SEQ V$HOXF/EN1.01 Homeobox 0.77 2657-2673 2665 ID protein engrailed NO: (en-1) 411 SEQ V$OCT1/OCT1.06 octamer-binding 0.80 2662-2676 2669 ID factor 1 NO: 412 SEQ V$BCL6/BCL6.01 POZ/zinc finger 0.76 2683-2699 2691 ID protein, NO: transcriptional 413 repressor, translocations observed in diffuse large cell lymphoma SEQ V$OCTP/OCT1P.01 octamer-binding 0.86 2715-2727 2721 ID factor 1, POU- NO: specific domain 414 SEQ V$TEAF/TEF1.01 TEF-1 related 0.84 2722-2734 2728 ID muscle factor NO: 415 SEQ V$GFI1/GFI1.01 Growth factor 0.97 2723-2737 2730 ID independence 1 NO: zinc finger 416 protein acts as transcriptional repressor SEQ V$HOXT/MEIS1_HOXA9.01 Homeobox 0.79 2729-2741 2735 ID protein MEIS1 NO: binding site 417 SEQ V$HOXF/HOXA9.01 Member of the 0.87 2728-2744 2736 ID vertebrate HOX - NO: cluster of 418 homeobox factors SEQ V$PARF/DBP.01 Albumin D-box 0.84 2729-2743 2736 ID binding protein NO: 419 SEQ V$VBPF/VBP.01 PAR-type chicken 0.86 2732-2742 2737 ID vitellogenin NO: promoter-binding 420 protein SEQ V$CREB/E4BP4.01 E4BP4, bZIP 0.80 2728-2748 2738 ID domain, NO: transcriptional 421 repressor SEQ V$OCT1/OCT1.01 octamer-binding 0.77 2733-2747 2740 ID factor 1 NO: 422 SEQ V$FKHD/XFD1.01 Xenopus fork 0.90 2733-2749 2741 ID head domain NO: factor 1 423 SEQ V$SRFF/SRF.01 serum response 0.66 2736-2754 2745 ID factor NO: 424 SEQ V$OCTP/OCT1P.01 octamer-binding 0.86 2746-2758 2752 ID factor 1, POU- NO: specific domain 425 SEQ V$CLOX/CDPCR3.01 cut-like 0.75 2748-2764 2756 ID homeodomain NO: protein 426 SEQ V$TBPF/TATA.01 cellular and viral 0.90 2749-2765 2757 ID TATA box NO: elements 427 SEQ V$SRFF/SRF.01 serum response 0.66 2750-2768 2759 ID factor 
NO: 428 SEQ V$TBPF/ATATA.01 Avian C-type LTR 0.81 2759-2775 2767 ID TATA box NO: 429 SEQ V$TBPF/TATA.02 Mammalian C- 0.89 2762-2778 2770 ID type LTR TATA NO: box 430 SEQ V$CABL/CABL.01 Multifunctional c- 0.97 2769-2779 2774 ID Abl src type NO: tyrosine kinase 431 SEQ V$LEFF/LEF1.01 TCF/LEF-1, 0.86 2766-2782 2774 ID involved in the NO: Wnt signal 432 transduction pathway SEQ V$OCT1/OCT1.06 octamer-binding 0.80 2775-2789 2782 ID factor 1 NO: 433 SEQ V$MEF2/MMEF2.01 myocyte 0.90 2776-2798 2787 ID enhancer factor NO: 434 SEQ V$OCT1/OCT1.06 octamer-binding 0.80 2780-2794 2787 ID factor 1 NO: 435 SEQ V$TBPF/TATA.01 cellular and viral 0.90 2779-2795 2787 ID TATA box NO: elements 436 SEQ V$CART/CART1.01 Cart-1 (cartilage 0.84 2780-2796 2788 ID homeoprotein 1) NO: 437 SEQ V$FKHD/XFD2.01 Xenopus fork- 0.89 2780-2796 2788 ID head domain NO: factor 2 438 SEQ V$MEF2/MEF2.05 MEF2 0.96 2778-2800 2789 ID NO: 439 SEQ V$BRNF/BRN3.01 POU transcription 0.78 2785-2801 2793 ID factor Brn-3 NO: 440 SEQ V$TBPF/TATA.01 cellular and viral 0.90 2786-2802 2794 ID TATA box NO: elements 441 SEQ V$GFI1/GFI1.01 Growth factor 0.97 2791-2805 2798 ID independence 1 NO: zinc finger 442 protein acts as transcriptional repressor SEQ V$HOXT/MEIS1_HOXA9.01 Homeobox 0.79 2797-2809 2803 ID protein MEIS1 NO: binding site 443 SEQ V$MEF2/MMEF2.01 myocyte 0.90 2792-2814 2803 ID enhancer factor NO: 444 SEQ V$MEF2/MEF2.05 MEF2 0.96 2795-2817 2806 ID NO: 445 SEQ V$MEF2/MMEF2.01 myocyte 0.90 2797-2819 2808 ID enhancer factor NO: 446 SEQ V$HNF1/HNF1.01 hepatic nuclear 0.78 2802-2818 2810 ID factor 1 NO: 447 SEQ V$MEF2/MEF2.01 myogenic 0.74 2799-2821 2810 ID enhancer factor 2 NO: 448 SEQ V$HOXF/HOX1-3.01 Hox-1.3, 0.83 2814-2830 2822 ID vertebrate NO: homeobox 449 protein SEQ V$PARF/DBP.01 Albumin D-box 0.84 2816-2830 2823 ID binding protein NO: 450 SEQ V$PDX1/ISL1.01 Pancreatic and 0.82 2814-2834 2824 ID intestinal lim- NO: homeodomain 451 factor SEQ V$GATA/GATA1.02 GATA-binding 0.99 2819-2831 2825 ID factor 1 NO: 
452 SEQ V$HEAT/HSF1.01 heat shock factor 1 0.93 2845-2855 2850 ID NO: 453 SEQ V$MYT1/MYT1.02 MyT1 zinc finger 0.88 2853-2865 2859 ID transcription NO: factor involved in 454 primary neurogenesis SEQ V$BCL6/BCL6.02 POZ/zinc finger 0.77 2857-2873 2865 ID protein, NO: transcriptional 455 repressor, translocations observed in diffuse large cell lymphoma SEQ V$TTFF/TTF1.01 Thyroid 0.92 2863-2877 2870 ID transcription NO: factor-1 (TTF1) 456 binding site SEQ V$EBOX/USF.02 upstream 0.94 2868-2884 2876 ID stimulating factor NO: 457 SEQ V$HOXF/PTX1.01 Pituitary 0.79 2892-2908 2900 ID Homeobox 1 NO: (Ptx1) 458 SEQ V$MYOD/LMO2COM.01 complex of Lmo2 0.98 2901-2915 2908 ID bound to Tal-1, NO: E2A proteins, and 459 GATA-1, half-site 1 SEQ V$REBV/EBVR.01 Epstein-Barr 0.81 2904-2924 2914 ID virus NO: transcription 460 factor R SEQ V$ETSF/PU1.01 Pu.1 (Pu120) Ets- 0.86 2932-2948 2940 ID like transcription NO: factor identified 461 in lymphoid B- cells SEQ V$MITF/MIT.01 MIT 0.81 2943-2961 2952 ID (microphthalmia NO: transcription 462 factor) and TFE3 SEQ V$HAML/AML1.01 runt-factor AML-1 1.00 2950-2964 2957 ID NO: 463 SEQ V$NFKB/CREL.01 c-Rel 0.91 2954-2968 2961 ID NO: 464 SEQ V$IKRS/IK3.01 Ikaros 3, 0.84 2958-2970 2964 ID potential NO: regulator of 465 lymphocyte differentiation SEQ V$RBPF/RBPJK.01 Mammalian 0.84 2957-2971 2964 ID transcriptional NO: repressor RBP- 466 Jkappa/CBF1 SEQ V$E2FF/E2F.01 E2F, involved in 0.74 2966-2980 2973 ID cell cycle NO: regulation, 467 interacts with Rb p107 protein SEQ V$E4FF/E4F.01 GLI-Krueppel- 0.82 2968-2980 2974 ID related NO: transcription 468 factor, regulator of adenovirus E4 promoter SEQ V$CREB/ATF6.02 Activating 0.85 2966-2986 2976 ID transcription NO: factor 6, member 469 of b-zip family, induced by ER stress SEQ V$EBOX/ARNT.01 AhR nuclear 0.89 2968-2984 2976 ID translocator NO: homodimers 470 SEQ V$E4FF/E4F.01 GLI-Krueppel- 0.82 2971-2983 2977 ID related NO: transcription 471 factor, regulator of adenovirus E4 promoter SEQ V$EBOR/XBP1.01 
X-box-binding 0.86 2970-2984 2977 ID protein 1 NO: 472 SEQ V$E2FF/E2F.01 E2F, involved in 0.74 2971-2985 2978 ID cell cycle NO: regulation, 473 interacts with Rb p107 protein SEQ V$STAT/STAT.01 signal 0.87 2989-3007 2998 ID transducers and NO: activators of 474 transcription SEQ V$BCL6/BCL6.02 POZ/zinc finger 0.77 2991-3007 2999 ID protein, NO: transcriptional 475 repressor, translocations observed in diffuse large cell lymphoma SEQ V$XSEC/STAF.01 Se-Cys tRNA 0.77 3003-3025 3014 ID gene NO: transcription 476 activating factor SEQ V$NF1F/NF1.01 Nuclear factor 1 0.94 3007-3025 3016 ID NO: 477 SEQ V$OCT1/OCT1.02 octamer-binding 0.82 3014-3028 3021 ID factor 1 NO: 478 SEQ V$RCAT/CLTR_CAAT.01 Mammalian C- 0.75 3019-3043 3031 ID type LTR CCAAT NO: box 479 SEQ V$SF1F/SF1.01 SF1 steroidogenic 0.95 3033-3045 3039 ID factor 1 NO: 480 SEQ V$OCT1/OCT1.01 octamer-binding 0.77 3038-3052 3045 ID factor 1 NO: 481 SEQ V$PARF/DBP.01 Albumin D-box 0.84 3042-3056 3049 ID binding protein NO: 482 SEQ V$ETSF/ETS1.01 c-Ets-1 binding 0.92 3057-3073 3065 ID site NO: 483 SEQ V$LEFF/LEF1.01 TCF/LEF-1, 0.86 3062-3078 3070 ID involved in the NO: Wnt signal 484 transduction pathway SEQ V$MAZF/MAZ.01 Myc associated 0.90 3072-3084 3078 ID zinc finger NO: protein (MAZ) 485 SEQ V$SP1F/GC.01 GC box elements 0.88 3071-3085 3078 ID NO: 486 SEQ V$TBPF/TATA.01 cellular and viral 0.90 3091-3107 3099 ID TATA box NO: 487 elements SEQ V$SEF1/SEF1.01 SEF1 binding site 0.69 3099-3117 3108 ID NO: 488 SEQ ID Core Matrix NO: Str. sim. sim. 
Sequence SEQ (+) 1.000 0.900 ggaccatCAAAgtctgt ID NO: 41 SEQ (+) 1.000 0.823 agtctgtCATGtcatttgg ID NO: 42 SEQ (+) 0.833 0.904 gTCATgtcatttggg ID NO: 43 SEQ (+) 1.000 1.000 GTCAttt ID NO: 44 SEQ (+) 1.000 0.735 ctgtcatgtcatTTGGgggagggcctatg ID NO: 45 SEQ (−) 1.000 0.982 gccctCCCCcaaa ID NO: 46 SEQ (+) 0.876 0.898 tgggGGAGggcctat ID NO: 47 SEQ (−) 0.884 0.708 acagaggagggcATAGgccct ID NO: 48 SEQ (−) 0.800 0.811 cagataCACAgaggagggcataggccctc ID NO: 49 SEQ (−) 1.000 0.987 tgctattTAAGcccaga ID NO: 50 SEQ (−) 1.000 0.932 tgctATTTa ID NO: 51 SEQ (−) 0.750 0.865 ggtatgctATTTaag ID NO: 52 NO: 65 SEQ (−) 1.000 0.894 ttatAAAGctgagga ID NO: 66 SEQ (−) 1.000 0.910 agttaTAAAgctgagga ID NO: 67 SEQ (−) 1.000 0.902 agtgAAAGcagagag ID NO: 68 SEQ (−) 0.750 0.756 craCAGTtgacct ID NO: 69 SEQ (+) 1.000 0.940 GTCTtgact ID NO: 70 SEQ (−) 1.000 0.960 gagggATTAgaaaagga ID NO: 71 SEQ (−) 1.000 0.906 ggaatCCAAtygtag ID NO: 72 SEQ (+) 0.789 0.802 ctacraTTGGattccat ID NO: 73 SEQ (−) 1.000 0.897 tacagcTAAAcactgag ID NO: 74 SEQ (−) 0.840 0.865 gagcctTCATccagtagct ID NO: 75 SEQ (−) 1.000 0.746 tgtcatcttagagCCTTcatc ID NO: 76 SEQ (+) 1.000 0.861 agGCTCtaagatg ID NO: 77 SEQ (+) 0.750 0.837 tcTAAGatgacaattaa ID NO: 78 SEQ (+) 0.807 0.840 aaGATGacaattaag ID NO: 79 SEQ (+) 1.000 0.994 gacaATTAa ID NO: 80 SEQ (−) 1.000 1.000 cctTAATtgtcat ID NO: 81 SEQ (−) 0.766 0.808 cgacgattACCTtaattgtca ID NO: 82 SEQ (−) 0.750 0.768 aatgaggATCGacgattacct ID NO: 83 SEQ (+) 1.000 0.886 cgatcctcATTAtagtg ID NO: 84 SEQ (+) 1.000 0.868 tatagtGGAAgggcttc ID NO: 85 SEQ (+) 1.000 0.904 agggcttCAAAggcagt ID NO: 86 SEQ (−) 0.758 0.867 gagacTGCCtttgaagccc ID NO: 87 SEQ (−) 1.000 0.971 ttcaGATAggcag ID NO: 88 SEQ (−) 0.757 0.672 atgttcaGATAggcagtag ID NO: 89 SEQ (−) 0.800 0.824 gGAAAtgttcagatagg ID NO: 90 SEQ (+) 1.000 0.951 cctaatgCCAGatgtct ID NO: 91 SEQ (+) 1.000 0.852 aatgcCAGAtgtctctt ID NO: 92 SEQ (−) 1.000 0.851 gagaCATCtggca ID NO: 93 SEQ (−) 1.000 0.984 aggataggttTAAAgagacatct ID NO: 94 SEQ (−) 1.000 0.774 
gGATAggtttaaagaga ID NO: 95 SEQ (+) 1.000 0.813 tgtctcttTAAAcctatcctggc ID NO: 96 SEQ (+) 1.000 0.877 ctcttTAAAcctatcct ID NO: 97 SEQ (+) 1.000 0.845 ctcccttcATTAaggta ID NO: 98 SEQ (−) 1.000 0.834 gagatacctTAATgaagggag ID NO: 99 SEQ (+) 0.944 0.926 gGTATctcatttttt ID NO: 100 SEQ (−) 1.000 0.972 gcAAAAaatga ID NO: 101 SEQ (−) 0.764 0.720 ggaaCAGAggagagcaa ID NO: 102 SEQ (−) 0.881 0.964 aaaactgaATCAgtggnggaa ID NO: 103 SEQ (+) 1.000 0.886 actgATTCagt ID NO: 104 SEQ (+) 0.850 0.956 nccactgaTTCAgtttttctg ID NO: 105 SEQ (−) 0.876 0.910 aaaAACTgaat ID NO: 106 SEQ (−) 1.000 0.975 agAAAAactga ID NO: 107 SEQ (+) 1.000 0.875 ctgatccctctTGTTctcc ID NO: 108 SEQ (−) 1.000 0.971 gaaaaagagaAGGGa ID NO: 109 SEQ (−) 1.000 0.987 ggAAAAagaga ID NO: 110 SEQ (−) 1.000 0.982 ggagGAAAaag ID NO: 111 SEQ (−) 1.000 0.910 ggtgGAGGgaagg ID NO: 112 SEQ (−) 1.000 0.932 gggggTGGGagggtg ID NO: 113 SEQ (+) 1.000 0.972 tcccaCCCCcatg ID NO: 114 SEQ (−) 1.000 0.815 aggaagggGAAAggg ID NO: 115 SEQ (−) 1.000 0.911 aaaataggAAATaagga ID NO: 116 SEQ (−) 1.000 0.883 aaaataGGAAataagga ID NO: 117 SEQ (−) 0.760 0.792 aGAGAaaataggaaata ID NO: 118 SEQ (−) 0.763 0.817 cccccagagaaAATAgg ID NO: 119 SEQ (−) 1.000 0.934 ccacaCCCCcaga ID NO: 120 SEQ (+) 0.983 0.894 gggtgtgGATTttat ID NO: 121 SEQ (−) 1.000 0.942 caccaTAAAatccacac ID NO: 122 SEQ (−) 0.866 0.813 aacataTGCAcagaagggcttccaccata ID NO: 123 SEQ (−) 1.000 0.790 catATGCacagaagg ID NO: 124 SEQ (−) 1.000 0.910 caacatATGCaca ID NO: 125 SEQ (+) 0.757 0.666 ctgtgcaTATGttgtctta ID NO: 126 SEQ (−) 0.750 0.828 caataagacaaCATAtg ID NO: 127 SEQ (−) 1.000 0.776 ccAATAagacaacatat ID NO: 128 SEQ (−) 1.000 0.836 tcaaccaatAAGAcaac ID NO: 129 SEQ (−) 1.000 0.960 atcaaCCAAtaagac ID NO: 130 SEQ (+) 1.000 0.844 tcttatTGGTtgata ID NO: 131 SEQ (−) 1.000 0.943 tcaaCCAAtaa ID NO: 132 SEQ (+) 1.000 0.956 ggttGATAaataa ID NO: 133 SEQ (+) 0.757 0.791 gGTTGataaataaagca ID NO: 134 SEQ (−) 0.750 0.797 gTGCTttatttat ID NO: 135 SEQ (+) 1.000 0.912 gttgtCCAAtaggga ID NO: 136 SEQ (+) 0.750 
0.843 aataggGAAAcaagata ID NO: 137 SEQ (+) 1.000 0.960 tagggaaacaAGATagg ID NO: 138 SEQ (+) 1.000 0.970 acaaGATAggtgg ID NO: 139 SEQ (−) 0.750 0.867 cccaCCTAtct ID NO: 140 SEQ (−) 1.000 0.929 ggatcacatgGCAAccctc ID NO: 141 SEQ (−) 0.895 0.936 ggatCACAtggcaacc ID NO: 142 SEQ (+) 1.000 0.863 gggttgcCATGtgatccta ID NO: 143 SEQ (+) 1.000 0.950 ctaggaGGAAttgacac ID NO: 144 SEQ (−) 1.000 0.800 catgtgtcAATTcct ID NO: 145 SEQ (−) 1.000 1.000 tGTCAat ID NO: 146 SEQ (−) 1.000 0.835 ccattctCATGtgtcaatt ID NO: 147 SEQ (+) 0.846 0.800 caCATGagaatgggg ID NO: 148 SEQ (+) 1.000 0.998 gaaaGATAagtcc ID NO: 149 SEQ (−) 1.000 0.672 atattttTATAaggactta ID NO: 150 SEQ (−) 1.000 0.867 atatattTTTAtaaggact ID NO: 151 SEQ (+) 1.000 0.894 tccttaTAAAaatatat ID NO: 152 SEQ (+) 1.000 0.740 agtccttaTAAAaatatatatta ID NO: 153 SEQ (+) 1.000 0.963 ccttaTAAAaatatata ID NO: 154 SEQ (−) 1.000 0.870 acTAATatatattttta ID NO: 155 SEQ (−) 1.000 0.855 caTAATtactaatatat ID NO: 156 SEQ (−) 1.000 0.943 cataattacTAATatat ID NO: 157 SEQ (−) 1.000 0.816 cccATAAttactaatat ID NO: 158 SEQ (−) 0.757 0.765 ccCATAattactaatat ID NO: 159 SEQ (+) 1.000 0.989 agtaATTAt ID NO: 160 SEQ (−) 1.000 0.976 ccatAATTactaa ID NO: 161 SEQ (−) 1.000 0.886 aacccataATTActaat ID NO: 162 SEQ (−) 1.000 0.775 attaacccaTAATtactaata ID NO: 163 SEQ (+) 0.826 0.844 tatgggttAATAattaa ID NO: 164 SEQ (−) 0.755 0.857 aCTTAattattaaccca ID NO: 165 SEQ (+) 1.000 0.966 gGTTAataattaagtca ID NO: 166 SEQ (+) 1.000 0.972 taatAATTaag ID NO: 167 SEQ (−) 1.000 0.995 cttaATTAt ID NO: 168 SEQ (−) 1.000 0.873 ctgacttaATTAttaac ID NO: 169 SEQ (+) 1.000 0.988 taatAATaagtc ID NO: 170 SEQ (+) 1.000 0.931 taataATTAagtc ID NO: 171 SEQ (+) 1.000 0.881 taataatTAAGtcagag ID NO: 172 SEQ (−) 0.766 0.819 tagctctgACTTaattattaa ID NO: 173 SEQ (+) 0.750 0.874 ataattaAGTCagagct ID NO: 174 SEQ (+) 0.856 0.928 ctagCCATtaa ID NO: 175 SEQ (−) 1.000 0.903 tctTAATggctag ID NO: 176 SEQ (−) 0.770 0.842 ctagtGTTTcttaatggctag ID NO: 177 SEQ (+) 1.000 0.891 gcttcataATTAatata ID NO: 178 
SEQ (−) 1.000 0.995 attaATTAt ID NO: 179 SEQ (+) 1.000 0.988 tcatAATTaatat ID NO: 180 SEQ (+) 1.000 0.952 tcataATTAatat ID NO: 181 SEQ (+) 1.000 0.945 ttcataatTAATatagt ID NO: 182 SEQ (−) 1.000 0.885 actatattAATTatg ID NO: 183 SEQ (−) 1.000 0.854 gatactatATTAattat ID NO: 184 SEQ (+) 0.750 0.875 tgtatgttCATTtgg ID NO: 185 SEQ (+) 0.850 0.887 gtatgttCATTtggg ID NO: 186 SEQ (−) 1.000 0.816 cCCCAaatgaacata ID NO: 187 SEQ (−) 1.000 0.849 tcagcccCAAAtgaa ID NO: 188 SEQ (+) 1.000 0.828 tggggcTGACacagttctggg ID NO: 189 SEQ (+) 1.000 0.833 ggggcTGACacagttctggga ID NO: 190 SEQ (+) 0.750 0.791 aggAAGAytactt ID NO: 191 SEQ (−) 0.804 0.820 cctacaATCCatgtacc ID NO: 192 SEQ (−) 1.000 0.864 atagagCAAAggactac ID NO: 193 SEQ (−) 1.000 0.907 catagagCAAAggacta ID NO: 194 SEQ (−) 1.000 0.700 tagacatagagcAAAGgacta ID NO: 195 SEQ (+) 0.804 0.831 gtctaaATCCatatatg ID NO: 196 SEQ (+) 0.833 0.929 ctaaaTCCAtatatg ID NO: 197 SEQ (+) 1.000 0.851 aaatCCATatatgaatgag ID NO: 198 SEQ (−) 1.000 0.761 actcattcatatATGGa ID NO: 199 SEQ (−) 1.000 0.919 actcATTCata ID NO: 200 SEQ (−) 0.807 0.901 tggtATGTa ID NO: 201 SEQ (−) 1.000 0.922 gaaagayAAACatggta ID NO: 202 SEQ (−) 0.789 0.898 gtgAGGTaacccc ID NO: 203 SEQ (+) 1.000 0.854 atgggGTTAcctcactcagga ID NO: 204 SEQ (+) 1.000 0.903 gTTACctcact ID NO: 205 SEQ (−) 0.758 0.870 cgcAGGCaaatgaat ID NO: 206 SEQ (+) 0.758 0.850 tcattTGCCtgcgaatttt ID NO: 207 SEQ (+) 1.000 0.869 tgcgaatTTTAagattcca ID NO: 208 SEQ (−) 1.000 0.990 taaaaCAATggaatctt ID NO: 209 SEQ (−) 1.000 0.931 aggaataaAACAatgga ID NO: 210 SEQ (+) 1.000 0.865 ccattgtTTTAttcctctg ID NO: 211 SEQ (−) 0.894 0.876 gaggAATAaaacaat ID NO: 212 SEQ (+) 1.000 0.824 tcctctgagTAATactccatt ID NO: 213 SEQ (−) 1.000 0.925 ttacaCAATggagtatt ID NO: 214 SEQ (−) 0.901 0.920 ggtacATTAcacaatggagta ID NO: 215 SEQ (−) 1.000 0.871 aTTACacaatg ID NO: 216 SEQ (+) 0.929 0.955 tccattgtGTAAtgtacca ID NO: 217 SEQ (+) 1.000 0.859 tccattgtgTAATgtaccaca ID NO: 218 SEQ (−) 1.000 1.000 aaaatgTGGTacatt ID NO: 219 SEQ (+) 0.750 0.819 
aatgtaccacaTTTTctcc ID NO: 220 SEQ (+) 1.000 0.896 taCATTcttcagt ID NO: 221 SEQ (+) 1.000 0.990 caGTTGagg ID NO: 222 SEQ (−) 1.000 0.932 gcaatagCCAGaacctg ID NO: 223 SEQ (−) 1.000 0.945 gcaatagCCAG ID NO: 224 SEQ (−) 1.000 0.951 attTGCAatagcc ID NO: 225 SEQ (+) 1.000 0.853 tggctattGCAAataaccc ID NO: 226 SEQ (+) 1.000 0.809 ctggctattgcAAATaaccctgc ID NO: 227 SEQ (+) 1.000 0.889 ctattgcAAATaacc ID NO: 228 SEQ (−) 1.000 0.900 ggttATTTg ID NO: 229 SEQ (+) 0.975 0.761 acatatgtcattATTGt ID NO: 230 SEQ (+) 0.944 0.938 cATATgtcattattg ID NO: 231 SEQ (+) 1.000 0.836 catatgtcATTAttgta ID NO: 232 SEQ (−) 1.000 0.889 ttcatacaaTAATgacatatg ID NO: 233 SEQ (−) 1.000 0.870 tcataCAATaatgacat ID NO: 234 SEQ (−) 0.944 0.914 aATATgtaaaacaga ID NO: 235 SEQ (−) 1.000 0.856 tttaaaatatGTAAaacagat ID NO: 236 SEQ (+) 1.000 0.886 tTTACatattt ID NO: 237 SEQ (+) 1.000 0.841 tatttTAAAccatctct ID NO: 238 SEQ (−) 1.000 0.783 caagCAATctaga ID NO: 239 SEQ (+) 1.000 0.765 tctctagATTGcttgtaatat ID NO: 240 SEQ (−) 1.000 0.997 tttaaCAATattacaag ID NO: 241 SEQ (+) 1.000 0.885 tattgtTAAAcatagag ID NO: 242 SEQ (+) 1.000 0.839 catagagagTAATaatgctat ID NO: 243 SEQ (−) 1.000 0.872 atagcattATTActctc ID NO: 244 SEQ (−) 0.826 0.843 tttatagcaTTATtactctct ID NO: 245 SEQ (+) 1.000 0.829 agTAATaatgctataaa ID NO: 246 SEQ (−) 1.000 0.906 tttaattTTTAtagcatta ID NO: 247 SEQ (+) 1.000 0.983 aataatgctaTAAAaattaaaaa ID NO: 248 SEQ (−) 0.755 0.805 tTTTAatttttatagca ID NO: 249 SEQ (+) 1.000 0.832 gctataaaAATTaaa ID NO: 250 SEQ (+) 1.000 0.991 tgctaTAAAaattaaaa ID NO: 251 SEQ (−) 1.000 0.989 tttTAATttttat ID NO: 252 SEQ (+) 1.000 0.944 taaaaATTAaaaa ID NO: 253 SEQ (+) 1.000 0.807 ataaaaatTAAAaataatgataa ID NO: 254 SEQ (+) 1.000 0.872 aataatgatAAGAaaga ID NO: 255 SEQ (+) 1.000 0.993 taatGATAagaaa ID NO: 256 SEQ (+) 1.000 0.931 gaaAGATcctata ID NO: 257 SEQ (+) 1.000 0.915 tacAGATgaaaat ID NO: 258 SEQ (+) 0.763 0.867 cagATGAaaatttag ID NO: 259 SEQ (+) 0.985 0.964 aaaatttaGAAAtacttta ID NO: 260 SEQ (−) 0.958 0.866 
agcTAAAgtatttct ID NO: 261 SEQ (−) 1.000 0.763 TCGTcagtggtag ID NO: 262 SEQ (+) 1.000 0.923 taccacTGACgaaatttgtat ID NO: 263 SEQ (−) 1.000 0.959 tttaattCCAGacattc ID NO: 264 SEQ (−) 1.000 0.977 cttTAATtccaga ID NO: 265 SEQ (+) 1.000 0.923 ctggaATTAaaga ID NO: 266 SEQ (+) 1.000 0.898 ctggAATTaaagaaa ID NO: 267 SEQ (−) 1.000 0.915 cagTAATttcttt ID NO: 268 SEQ (+) 1.000 0.922 aaagaaattacTGTTcttt ID NO: 269 SEQ (−) 1.000 0.934 ttataTAAAgaacagta ID NO: 270 SEQ (−) 1.000 0.890 attataTAAAgaacagt ID NO: 271 SEQ (−) 0.891 0.923 tattaTATAaagaacag ID NO: 272 SEQ (−) 0.769 0.856 ctattattatATAAagaacag ID NO: 273 SEQ (+) 1.000 0.836 tttatataaTAATagactgta ID NO: 274 SEQ (+) 0.791 0.760 tataataATAGactgtaaaat ID NO: 275 SEQ (+) 1.000 0.912 gactgTAAAatggcaac ID NO: 276 SEQ (+) 0.750 0.817 gtaaaatgGCAActt ID NO: 277 SEQ (+) 1.000 0.907 ctgtaaaatgGCAActttt ID NO: 278 SEQ (−) 1.000 0.882 taaAAGTtgccat ID NO: 279 SEQ (+) 1.000 0.878 tatttgctAATTcac ID NO: 280 SEQ (−) 0.777 0.865 tcctgTGAAttagcaaatatt ID NO: 281 SEQ (+) 1.000 0.969 tgCTAAttcacag ID NO: 282 SEQ (−) 0.850 0.866 tcctgtgAATTagca ID NO: 283 SEQ (+) 0.750 0.788 ttgctaatTCACaggat ID NO: 284 SEQ (−) 1.000 0.973 agAAAAaatcc ID NO: 285 SEQ (−) 1.000 0.908 agatgTTCCaaagaaaaaa ID NO: 286 SEQ (−) 1.000 0.919 ttgttCAGAtgttccaa ID NO: 287 SEQ (+) 1.000 0.953 tgaacaAATTtccctta ID NO: 288 SEQ (+) 0.750 0.757 acaAATTtccctt ID NO: 289 SEQ (+) 1.000 0.771 tttccctTATAtgaatcac ID NO: 290 SEQ (−) 1.000 0.908 agtGATTcatataaggg ID NO: 291 SEQ (−) 1.000 0.797 gTGATtcatataa ID NO: 292 SEQ (−) 1.000 0.912 agtgATTCata ID NO: 293 SEQ (+) 0.881 0.958 ttatatgaATCActtacattt ID NO: 294 SEQ (+) 1.000 0.860 cTTACattttt ID NO: 295 SEQ (+) 0.850 0.829 gcctgttCATTtaaa ID NO: 296 SEQ (−) 1.000 0.832 gtttTTTAaatgaacag ID NO: 297 SEQ (+) 1.000 0.853 tcattTAAAaaactgca ID NO: 298 SEQ (+) 1.000 0.866 actgcAGGAaagttgtg ID NO: 299 SEQ (+) 1.000 0.891 ggaAAGTtgtgat ID NO: 300 SEQ (−) 1.000 1.000 ataAATCacaacttt ID NO: 301 SEQ (−) 1.000 0.931 cattaTAAAtcacaact ID NO: 
302 SEQ (−) 1.000 0.933 tgcattatAAATcacaa ID NO: 303 SEQ (+) 1.000 0.924 gTGATttataatg ID NO: 304 SEQ (−) 0.866 0.827 agttgcatTATAaatcacaactt ID NO: 305 SEQ (+) 0.894 0.898 tgtgATTTataatgc ID NO: 306 SEQ (+) 1.000 0.971 tgtGATTtataatgcaa ID NO: 307 SEQ (+) 1.000 0.916 gtgatttaTAATgcaac ID NO: 308 SEQ (+) 0.884 0.891 atttaTAATgcaact ID NO: 309 SEQ (+) 1.000 0.861 ataATGCaactgcac ID NO: 310 SEQ (+) 1.000 0.910 cagtctTAAAcaatgct ID NO: 311 SEQ (+) 1.000 0.992 ttaaaCAATgctaacca ID NO: 312 SEQ (+) 1.000 0.981 actgtGTTTcagc ID NO: 313 SEQ (−) 1.000 0.889 gggAAGTttatgc ID NO: 314 SEQ (−) 1.000 0.878 tgtgTGGGaagttta ID NO: 315 SEQ (+) 0.763 0.826 actATGAaaacacat ID NO: 316 SEQ (+) 1.000 0.786 actatgaaAACAcatgc ID NO: 317 SEQ (+) 0.895 0.920 gaaaaCACAtgcttaaa ID NO: 318 SEQ (−) 0.773 0.791 cctttAAGCatgtgttttc ID NO: 319 SEQ (+) 1.000 0.874 cttaaaggCAAAtct ID NO: 320 SEQ (−) 0.858 0.782 aGGTAaagatttgcctt ID NO: 321 SEQ (−) 1.000 0.853 ctgaggTAAAgatttgc ID NO: 322 SEQ (−) 0.789 0.830 ctgAGGTaaagat ID NO: 323 SEQ (+) 0.766 0.820 aaatctttACCTcagttaact ID NO: 324 SEQ (+) 1.000 0.862 tTTACctcagt ID NO: 325 SEQ (−) 0.750 0.775 gaaTAGTtaactg ID NO: 326 SEQ (+) 1.000 0.811 aGTTAactattccatag ID NO: 327 SEQ (+) 0.856 0.925 agagCCATtga ID NO: 328 SEQ (−) 1.000 0.873 tgaacTCAAtggctc ID NO: 329 SEQ (−) 1.000 0.980 ctTGAActcaa ID NO: 330 SEQ (+) 1.000 0.854 attgagtTCAAgtgcattt ID NO: 331 SEQ (+) 0.782 0.813 tgagTTCAagtgcattt ID NO: 332 SEQ (+) 1.000 1.000 gttcAAGTgcatt ID NO: 333 SEQ (+) 1.000 0.928 agaAGATataatg ID NO: 334 SEQ (−) 0.891 0.912 atataTATAtggccata ID NO: 335 SEQ (+) 1.000 0.777 atggccaTATAtatatata ID NO: 336 SEQ (−) 1.000 0.806 atatatatatatATGGc ID NO: 337 SEQ (−) 0.750 0.675 CTGTgctgatatatatata ID NO: 338 SEQ (+) 0.750 0.827 atatataTCAGcacagt ID NO: 339 SEQ (+) 1.000 0.904 ataTATCagcacagt ID NO: 340 SEQ (+) 1.000 0.704 atcAGCAcagtggaaacagtt ID NO: 341 SEQ (+) 1.000 0.970 agtgGAAAcag ID NO: 342 SEQ (−) 1.000 0.991 taactGTTTccac ID NO: 343 SEQ (−) 1.000 0.798 
tGTTAttaactgtttcc ID NO: 344 SEQ (+) 0.826 0.824 aaacagttAATAacatt ID NO: 345 SEQ (+) 1.000 0.749 ggaaacagtTAATaacatttt ID NO: 346 SEQ (−) 1.000 0.863 taTATGctaaaatgt ID NO: 347 SEQ (−) 0.891 0.908 tagtaTATAtgctaaaa ID NO: 348 SEQ (+) 1.000 0.897 gaggctGGAAgggggct ID NO: 349 SEQ (+) 1.000 0.787 gctggaagggggcTCAGcagtta ID NO: 350 SEQ (−) 0.876 0.901 attAACTgctg ID NO: 351 SEQ (+) 0.750 0.840 atagcacatacTATTcttc ID NO: 352 SEQ (+) 0.782 0.747 gtttggtttTCATcacccatg ID NO: 353 SEQ (−) 1.000 0.988 gaacCACCtgacatg ID NO: 354 SEQ (−) 1.000 0.958 tacaGATAgaaat ID NO: 355 SEQ (+) 1.000 0.924 gtaacCAGAtgatacga ID NO: 356 SEQ (−) 0.750 0.762 agGTACccaaggggact ID NO: 357 SEQ (−) 1.000 0.960 aggtGATAgaggt ID NO: 358 SEQ (−) 1.000 0.939 atagCAGGtgataga ID NO: 359 SEQ (+) 0.759 0.710 cacctgctattctCACCcaaaga ID NO: 360 SEQ (+) 1.000 0.805 aCCCAaagacacaca ID NO: 361 SEQ (−) 0.944 0.904 tGTATgtgagtgtgt ID NO: 362 SEQ (+) 1.000 0.854 tgcATGCacatagtt ID NO: 363 SEQ (−) 0.977 0.855 tGAACtatgtgcatg ID NO: 364 SEQ (+) 0.750 0.767 catagttcAAAAaataaaatttt ID NO: 365 SEQ (−) 1.000 0.896 ttaaaatTTTAttttttga ID NO: 366 SEQ (−) 0.750 0.798 taaAATTttattt ID NO: 367 SEQ (+) 1.000 0.991 aaagGAAAaaa ID NO: 368 SEQ (+) 1.000 0.977 ggAAAAaaagc ID NO: 369 SEQ (−) 1.000 0.946 aaaAGATttgagc ID NO: 370 SEQ (−) 1.000 0.901 aggaATTTt ID NO: 371 SEQ (+) 0.806 0.820 taaaatTCCTatgagtgtgtgat ID NO: 372 SEQ (−) 0.782 0.753 tactgacttTGATcacacact ID NO: 373 SEQ (−) 1.000 0.942 cacAGATtatacc ID NO: 374 SEQ (+) 1.000 0.971 tgtgGAAAaca ID NO: 375 SEQ (+) 0.980 0.879 ctcagtATTCaca ID NO: 376 SEQ (−) 1.000 0.827 ctactttCATGtgtgaata ID NO: 377 SEQ (−) 0.850 0.952 cttTCATgtgtga ID NO: 378 SEQ (+) 1.000 0.838 aagtagcTAAGaataaa ID NO: 379 SEQ (−) 1.000 0.960 aatAGATtttatt ID NO: 380 SEQ (+) 0.806 0.819 aataaaATCTattcatc ID NO: 381 SEQ (+) 0.785 0.846 taaaaTCTAttcatc ID NO: 382 SEQ (+) 1.000 0.890 atctATTCatc ID NO: 383 SEQ (−) 1.000 0.881 aaaaaCAGAtgaataga ID NO: 384 SEQ (−) 1.000 0.981 ggAAAAacaga ID NO: 385 SEQ (−) 
1.000 0.976 taagGAAAaac ID NO: 386 SEQ (−) 1.000 0.872 aggattttaaGGAAaaaca ID NO: 387 SEQ (+) 1.000 0.897 ttcctTAAAatcctggc ID NO: 388 SEQ (−) 1.000 0.880 actgagtcAACActgta ID NO: 389 SEQ (−) 1.000 0.984 accactgaGTCAacactgtag ID NO: 390 SEQ (+) 0.964 0.984 agtgttgaCTCAgtggttgct ID NO: 391 SEQ (−) 0.826 0.904 gcaaCCACtga ID NO: 392 SEQ (+) 1.000 0.883 tttaaatTTTAtgctcaaa ID NO: 393 SEQ (+) 1.000 0.891 caaAAGTtgaagc ID NO: 394 SEQ (+) 1.000 0.829 tgaaCCGGtaattctac ID NO: 395 SEQ (−) 1.000 0.757 acaAAGTagaatt ID NO: 396 SEQ (−) 0.750 0.816 aagtattTAATacaaag ID NO: 397 SEQ (−) 1.000 0.939 acaagtattTAATacaa ID NO: 398 SEQ (−) 1.000 0.865 taacAAGTattta ID NO: 399 SEQ (+) 1.000 0.882 acttgTTATgcatcg ID NO: 400 SEQ (−) 1.000 0.758 aacttgatttgttgAGCGatgcataacaa ID NO: 401 SEQ (+) 0.750 0.809 ctcaaCAAAtcaagt ID NO: 402 SEQ (+) 1.000 0.976 acaAATCaagtttta ID NO: 403 SEQ (+) 1.000 0.830 acaaaTCAAgtttta ID NO: 404 SEQ (−) 0.750 0.756 taaAACTtgattt ID NO: 405 SEQ (+) 1.000 0.907 aaaTCAAgtttta ID NO: 406 SEQ (+) 1.000 0.936 caaatCAAGttttaa ID NO: 407 SEQ (+) 1.000 0.887 atcAAGTtttaac ID NO: 408 SEQ (+) 1.000 0.883 atcaagtTTTAacacacca ID NO: 409 SEQ (−) 1.000 0.925 ttaaaaAATTtaagata ID NO: 410 SEQ (+) 1.000 0.780 atttTTTAaatgggcat ID NO: 411 SEQ (−) 0.750 0.818 tttatgccCATTtaa ID NO: 412 SEQ (+) 1.000 0.796 ctaTTCCtacagaagtc ID NO: 413 SEQ (+) 1.000 0.860 ctgaaaATGCatt ID NO: 414 SEQ (+) 1.000 0.898 tgCATTcctgatt ID NO: 415 SEQ (−) 1.000 0.981 ataAATCaggaatgc ID NO: 416 SEQ (+) 1.000 0.929 cTGATttatgtaa ID NO: 417 SEQ (+) 1.000 0.964 cctGATTtatgtaaata ID NO: 418 SEQ (+) 1.000 0.861 ctgatTTATgtaaat ID NO: 419 SEQ (−) 1.000 0.929 tTTACataaat ID NO: 420 SEQ (+) 1.000 0.943 cctgatttatGTAAatatatg ID NO: 421 SEQ (+) 1.000 0.895 ttTATGtaaatatat ID NO: 422 SEQ (+) 1.000 0.940 tttatgTAAAtatatgt ID NO: 423 SEQ (+) 1.000 0.691 atgtaaaTATAtgtatata ID NO: 424 SEQ (+) 0.849 0.883 atgtatATACata ID NO: 425 SEQ (+) 0.888 0.755 gtatatacatatATAGc ID NO: 426 SEQ (−) 0.891 0.903 ggctaTATAtgtatata ID 
NO: 427 SEQ (+) 1.000 0.709 atatacaTATAtagcctta ID NO: 428 SEQ (−) 1.000 0.816 ttgttttTAAGgctata ID NO: 429 SEQ (+) 1.000 0.899 agcctTAAAaacaaaga ID NO: 430 SEQ (+) 1.000 0.973 aaAACAaagat ID NO: 431 SEQ (+) 1.000 0.863 ttaaaaaCAAAgattgt ID NO: 432 SEQ (+) 1.000 0.811 aagattgtAATTttt ID NO: 433 SEQ (−) 1.000 0.900 acaatttaTAAAaattacaatct ID NO: 434 SEQ (−) 1.000 0.844 tttataaaAATTaca ID NO: 435 SEQ (−) 1.000 0.956 atttaTAAAaattacaa ID NO: 436 SEQ (+) 1.000 0.875 tgTAATttttataaatt ID NO: 437 SEQ (−) 1.000 0.903 aatttaTAAAaattaca ID NO: 438 SEQ (−) 1.000 0.973 tcacaatttaTAAAaattacaat ID NO: 439 SEQ (−) 0.750 0.798 atcACAAtttataaaaa ID NO: 440 SEQ (+) 1.000 0.927 ttttaTAAAttgtgatt ID NO: 441 SEQ (−) 1.000 0.997 aaaAATCacaattta ID NO: 442 SEQ (+) 1.000 0.806 gTGATttttaaaa ID NO: 443 SEQ (−) 1.000 0.923 tattttttTAAAaatcacaattt ID NO: 444 SEQ (+) 1.000 0.990 ttgtgattttTAAAaaaataaac ID NO: 445 SEQ (+) 1.000 0.905 gtgattttTAAAaaaataaacct ID NO: 446 SEQ (−) 0.755 0.796 gGTTTatttttttaaaa ID NO: 447 SEQ (+) 0.750 0.775 gatttttaAAAAaataaacctgc ID NO: 448 SEQ (+) 1.000 0.848 aaacctgcATTAtcttc ID NO: 449 SEQ (−) 0.884 0.851 gaagaTAATgcaggt ID NO: 450 SEQ (−) 1.000 0.853 tgctgaagaTAATgcaggttt ID NO: 451 SEQ (−) 1.000 0.993 tgaaGATAatgca ID NO: 452 SEQ (+) 0.867 0.951 TGAAtgttcct ID NO: 453 SEQ (+) 1.000 0.893 cctAAGTtttgta ID NO: 454 SEQ (+) 1.000 0.772 agttttgTAGAacttga ID NO: 455 SEQ (−) 1.000 0.927 cgtgtCAAGttctac ID NO: 456 SEQ (−) 1.000 0.997 tctgccaCGTGtcaagt ID NO: 457 SEQ (+) 1.000 0.795 aggattTTAGtctacac ID NO: 458 SEQ (−) 1.000 0.981 gatgCAGGtgtagac ID NO: 459 SEQ (−) 1.000 0.832 ctgtcctcagatgcaGGTGta ID NO: 460 SEQ (+) 1.000 0.873 ctaacaGGAAaggagac ID NO: 461 SEQ (+) 1.000 0.829 ggagacaCATGtgtggtag ID NO: 462 SEQ (+) 1.000 1.000 catgtgTGGTagttc ID NO: 463 SEQ (+) 1.000 0.919 tgtggtagTTCCcag ID NO: 464 SEQ (−) 1.000 0.841 aactgGGAActac ID NO: 465 SEQ (−) 1.000 0.842 aaacTGGGaactacc ID NO: 466 SEQ (−) 0.750 0.784 ttcacgtCAAAactg ID NO: 467 SEQ (−) 1.000 0.830 
ttcACGTcaaaac ID NO: 468 SEQ (+) 1.000 0.985 cagttttGACGtgaaaagtcc ID NO: 469 SEQ (+) 1.000 0.891 gttttgaCGTGaaaagt ID NO: 470 SEQ (+) 1.000 0.909 ttgACGTgaaaag ID NO: 471 SEQ (+) 1.000 0.890 tttgACGTgaaaagt ID NO: 472 SEQ (+) 1.000 0.837 ttgacgtGAAAagtc ID NO: 473 SEQ (+) 1.000 0.937 cattcttactGGAAacctc ID NO: 474 SEQ (+) 0.800 0.805 ttcttacTGGAaacctc ID NO: 475 SEQ (+) 0.782 0.791 acctCCCTgaatccatgccaagc ID NO: 476 SEQ (−) 1.000 0.964 gctTGGCatggattcaggg ID NO: 477 SEQ (+) 1.000 0.820 tccATGCcaagcact ID NO: 478 SEQ (+) 1.000 0.787 gCCAAgcactacccatcaccttgac ID NO: 479 SEQ (−) 1.000 0.954 cagtCAAGgtgat ID NO: 480 SEQ (−) 1.000 0.800 ctTATGccagtcaag ID NO: 481 SEQ (−) 1.000 0.862 agtgcTTATgccagt ID NO: 482 SEQ (−) 1.000 0.920 atcaaAGGAaatgagtg ID NO: 483 SEQ (−) 1.000 0.969 ggggcatCAAAggaaat ID NO: 484 SEQ (−) 1.000 0.912 gaggGAGGggcat ID NO: 485 SEQ (−) 0.876 0.920 tgagGGAGgggcatc ID NO: 486 SEQ (+) 1.000 0.973 tattaTAAAagcacagt ID NO: 487 SEQ (−) 1.000 0.700 gaaagagacgaCTGTgctt ID NO: 488 

We claim:
 1. A method for identifying agents which modulate INGAP expression comprising: contacting a host cell comprising a reporter construct having at least one INGAP regulatory region selected from nucleotides from the group consisting of SEQ ID NO: 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36 and a region encoding a detectable product with a test agent; determining expression of the detectable product in the cell; and identifying the test agent as a modulator of INGAP expression if the test agent modulates expression of the detectable product in the cell.
 2. The method of claim 5 wherein the regulatory sequence comprises nucleotides 1-3137 of SEQ ID NO: 2.
 3. The method of claim 1 wherein the reporter construct further comprises: a promoter element interposed between the regulatory region nucleotide sequence and the nucleotide sequence encoding the detectable product.
 4. The method of claim 3 wherein the promoter element is selected from SEQ ID NO: 2.
 5. An in vitro method for identifying agents which modulate INGAP expression, comprising: a. contacting a reporter construct having at least one INGAP regulatory region and a nucleotide sequence encoding a detectable product with a test substance under conditions sufficient for transcription and translation of said nucleotide sequence; determining expression of the detectable protein or nucleic acid product; and identifying the test substance as a modulator of INGAP expression if the test substance modulates expression of the detectable product.
 6. The method of claim 5 wherein the regulatory region nucleotide sequence comprises one or more regions chosen from nucleotides from the group consisting of SEQ ID NO: 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36.
 7. The method of claim 6 wherein the regulatory sequence comprises nucleotides 1-3137 of SEQ ID NO: 2.
 8. The method of claim 5 wherein the reporter construct further comprises: a promoter element interposed between the regulatory region nucleotide sequence and the nucleotide sequence encoding the detectable product.
 9. The method of claim 8 wherein the promoter element is selected from SEQ ID NO: 2.
 10. A method for inducing INGAP expression in a mammal in need thereof, comprising administering to the mammal an effective amount of a factor that stimulates INGAP expression in the said mammal.
 11. The method of claim 10 wherein the factor that stimulates INGAP expression was identified by: contacting a host cell comprising a reporter construct having at least one INGAP regulatory region and a region encoding a detectable product with a test agent; determining expression of the detectable product in the cell; and identifying the test agent as a modulator of INGAP expression if the test agent modulates expression of the detectable product in the cell; wherein the regulatory region nucleotide sequence comprises one or more regions chosen from nucleotides from the group consisting of SEQ ID NO: 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36.
 12. The method of claim 11 wherein the factor that stimulates INGAP expression was identified by: contacting the SEQ ID NOS: 1, 2, 23, 32, 35, 37, 28, 24, 25, 26, 27, 29, 30, 31, 33, 34, 38, and 36 or fragments thereof with a test agent; determining binding of the test agent to the nucleic acid; and identifying the test agent as a potential modulator of INGAP expression if the test agent binds to the nucleic acid.
 15. The method of claim 10 wherein the factor that stimulates INGAP expression is selected from hLIF or PMA. 